Showing posts with label String parsing. Show all posts

Saturday, May 10, 2008

Open Source JavaScript Compressor

Are you concerned about having your Firefox Extension JavaScript code compromised? XPI files can be extracted and code can be easily viewed. One solution that works very well for many organizations that wish to keep their JavaScript code secret is to use a JavaScript obfuscator.

Also known as a script compiler or script compressor, an obfuscator takes human-readable JavaScript code and converts it into text that is virtually impossible for humans to process. When you're ready to deploy your product live, your developers can "obfuscate" a version of the code for distribution and maintain the original human-readable version for continued maintenance and development.

In addition to making it difficult to reverse-engineer, compressed JavaScript files are generally 40% to 60% smaller than their aesthetically pleasing human-readable counterparts, as a result of comment, whitespace, and line break removal.
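To get a feel for where the size savings come from, here is a deliberately naive sketch that strips `//` comments, indentation, and blank lines. The `naiveStrip` function and the `greet` sample are hypothetical, made up for illustration. A real compressor parses the code properly; this toy version would mangle a `//` that appears inside a string, such as a URL.

```javascript
// Sample "human-readable" source, held as a string for the demonstration.
const readable = [
  "// Show a greeting.",
  "function greet(name) {",
  "    // Build the message.",
  "    var message = 'Hello, ' + name;",
  "",
  "    return message;",
  "}",
].join("\n");

// Hypothetical toy minifier: drops // comments, indentation, and blank lines.
// Assumption: no string in the source contains "//".
function naiveStrip(source) {
  return source
    .split("\n")
    .map((line) => line.replace(/\/\/.*$/, "").trim())
    .filter((line) => line.length > 0)
    .join("\n");
}

const stripped = naiveStrip(readable);
console.log(stripped);
console.log(readable.length + " characters before, " + stripped.length + " after");
```

The stripped version still defines the same `greet` function; only the parts meant for human readers are gone.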

http://www.javascript-source.com/

Please see the above link for a quick example of the difference between a human-readable JavaScript function and an obfuscated function. I wouldn't recommend purchasing this version, though, as there are open source versions out there that will accomplish the same goal.

http://javascriptcompressor.com/

This version is Dean Edwards's Packer. The problem is that the website also offers a decoder, which largely defeats the purpose of obfuscation, so I would recommend it only for compression and not for obfuscation.

These tools appear to be a great way to keep proprietary JavaScript code from falling into the wrong hands. Obfuscation is not perfect or foolproof, but consider this question: Is a thief more likely to snoop around in a car with unlocked doors or one that is securely locked?

Here are some links to free or open source obfuscators. All three work from the command line:

- YUI Compressor
- ObfuscateJS JavaScript Obfuscator
- JSO (JavaScript Obfuscator)

Thursday, October 11, 2007

Nested-Nested Quotes

Larry Wakeman of Memory Pharmaceuticals recently sent me a reminder about a follow-up article that I forgot to write. Back in April, I addressed the concept of quotes within quotes within quotes and how to handle this phenomenon.

Any programmer who has written Greasemonkey scripts, or done any type of server-side or client-side programming, has encountered or will encounter this particular problem. For example, a JavaScript function that generates HTML whose attributes call JavaScript functions that take strings as parameters requires some cleverness on the part of the developer, especially if the parameter passed into the JavaScript function is determined at the time the HTML is generated:


<html>
     <head>
     </head>
     <body>
         <script type="text/javascript">

    function alertParameter(stringParameter) {
         alert(stringParameter);
    }

    var topLevel = "test";

    function generateHTML(topLevel) {
         document.write("<a href=\"#\" onclick=\"alertParameter('"+topLevel+"')\">Click this link to see the result</a>");
    }

    generateHTML(topLevel);
        </script>

    </body>
</html>



Here is a breakdown of the above process: The text "document.write" is what I refer to as top level. This is the actual command that outputs the string of text included within the outer quotes. The outer quotes are the quotes that wrap the text inside the parentheses of the document.write method.


document.write(" /* These quotes are the outer quotes */ ");


Note that there are two types of quotes: Single quotes and double quotes. We can use either to represent the outer quotes. For this example, I chose double quotes. The first set of inner quotes, like the outer quotes, can also be double or single quotes. However, if the inner quote is the same as the outer quote, then the inner quote must be escaped. Consider the following example:

Nested JavaScript Quotes Example 1



document.write("<div onclick=\"alert('ht')\">click me</div>");


The above example is the same as the following example:

Nested JavaScript Quotes Example 2



document.write("<div onclick='alert(\"ht\")'>click me</div>");


Also, to demonstrate that we can use single-quotes as the outer quotes, check out this example:

Nested JavaScript Quotes Example 3



document.write('<div onclick="alert(\'ht\')">click me</div>');


Take a careful look at the above three JavaScript nested quote examples. Understanding this concept is a prerequisite to understanding deeper JavaScript nested quotes.

As I mentioned, the outer quote can be either a double or single quote, and the first inner quote can be either a double or a single quote independent of the choice made for the outer quote. In Example 1, I used double quotes for both the outer and inner quote, whereas in Examples 2 and 3 I alternated between single and double quotes. For convenience, here is Example 1 again:

Nested JavaScript Quotes Example 1



document.write("<div onclick=\"alert('ht')\">click me</div>");


It may seem confusing at first as to how this is possible, until you see the generated HTML in the browser source (I'll go into more detail on this later):


<div onclick="alert('ht')">click me</div>


The outer quotes don't appear in the generated HTML. Those were top level quotes. They served only to wrap what was being generated. As you can see, the escaped double-quotes render as actual quotes, and cause absolutely no conflict.

Nested JavaScript Quotes Example 4



document.write('<div onclick="alert(\'ht\')">click me</div>');


Example 4 produces the same output, except that in Example 4 it is the single quotes that must be escaped, because this time single quotes serve as the outer quotes.
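One way to check this without a browser is to compare the strings themselves, leaving out document.write. This is a minimal sketch; the variable names are only for illustration.

```javascript
// The same markup written with different outer quotes.
const example1 = "<div onclick=\"alert('ht')\">click me</div>";
const example2 = "<div onclick='alert(\"ht\")'>click me</div>";
const example4 = '<div onclick="alert(\'ht\')">click me</div>';

console.log(example1);
console.log(example1 === example4); // true: only the source notation differs
console.log(example1 === example2); // false: single and double quotes are swapped in the markup
```

Examples 1 and 4 are character-for-character the same string. Example 2 generates markup that behaves the same way in the browser, but the quote characters inside it are swapped.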


JavaScript Inner, inner quotes



By this point, it should be apparent that several combinations of outer and inner quotes work without causing an unterminated string literal error. However, after choosing an outer and an inner quote, choices become much more limited. Instead, from this point forward, nested quotes must alternate between single and double quotes. Below is an example:


<html>
     <head>
     </head>
     <body>
         <script type="text/javascript">

    function alertParameter(stringParameter) {
         alert(stringParameter);
    }

    var topLevel = "test";

    function generateHTML(topLevel) {
         document.write("<a href=\"#\" onclick=\"alertParameter('Say \\'Hello World\\'')\">Click this link to see the result</a>");
    }

    generateHTML(topLevel);
    </script>

    </body>
</html>


You may have noticed that the previous paragraph contains a sentence with strikethrough styling: the claim that nested quotes must alternate between single and double quotes. After trying the example I had planned, I realized that this rule is not necessarily true. I'll explain, but first, here is the output when clicking the link:

Say 'Hello World'

Definition of literal quote: This is a quote that represents a quote as text in a string. A quote that marks the beginning or end of a string is a non-literal quote, and it is not part of the actual text. A non-literal quote can be represented as a literal quote by escaping it with a backslash (\) character.

Since the outer quote is a double quote, the inner single quotes are literal quotes in the string. If the outer quotes were single quotes, then double quotes would be literal. As a result, I can place as many single quotes in a row as I want without affecting the string. However, this will affect not what occurs with the output, but what occurs with the output of the output:

(" ... onclick=\"alertParameter('Say \\'Hello World\\'')\"> ... ")

Outer quote = " (This marks the beginning and end of the string)
Inner quote = \" (Escaped so as not to flag "beginning/end of string")
Third-tier quote = ' (Literal quote)
Fourth-tier quote = \\' (Literal quote that will be generated as an escaped outer quote)

This is what really makes this JavaScript nesting quote process complicated. As I mentioned, the single quote is a literal quote. This means the string treats it as a normal character. However, in the HTML -- in the output that is -- the single quote becomes an outer quote. The single quotes around 'Hello World' are literal quotes in the JavaScript code, but in the HTML, these single quotes are inner quotes that must be escaped. However, using a single \' would simply result in a '. So the trick is to escape the backslash. Thus, a document.write("\\'"); would produce a \' in the HTML, and this would be resolved to a ' if processed again.
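The two passes can be watched in isolation. The sketch below is only an illustration: it uses `new Function` to stand in for the second round of processing that the browser performs on the onclick attribute.

```javascript
// First pass: the JavaScript string literal "\\'" holds two characters, \ and '.
const firstPass = "\\'";
console.log(firstPass);        // \'
console.log(firstPass.length); // 2

// Second pass: place those two characters inside a single-quoted literal
// and evaluate it, as the onclick handler would. The \' resolves to '.
const secondPass = new Function("return '" + firstPass + "';")();
console.log(secondPass);       // '
```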

Essentially, you must take two things into consideration when dealing with escaping a quote in JavaScript. If it needs to be escaped in the inner string, then you must think one level deep. It helps to move backwards. Write your desired output first, and then go through and "wrap it" with the tools that would generate it:


<script type="text/javascript">document.write("<a href=\"#\" onclick=\"alert('Say \\'Hello World\\'');\">click here</a>");</script>
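Working backwards like this can also be sketched as a small helper that adds one wrapper at a time. The `wrapInQuotes` function below is hypothetical, not part of any library, and it assumes the message contains no double quotes or HTML special characters, since those would also need HTML encoding inside the onclick attribute.

```javascript
// Hypothetical helper: wrap `text` in `quote` characters, escaping any
// existing backslashes first and then any occurrence of the quote itself.
function wrapInQuotes(text, quote) {
  const escaped = text
    .replace(/\\/g, "\\\\")
    .split(quote)
    .join("\\" + quote);
  return quote + escaped + quote;
}

// Start from the desired end result and wrap outward.
const message = "Say 'Hello World'";
const jsCall = "alert(" + wrapInQuotes(message, "'") + ");";
const html = '<a href="#" onclick="' + jsCall + '">click here</a>';
const statement = "document.write(" + wrapInQuotes(html, '"') + ");";

console.log(jsCall);
console.log(html);
console.log(statement);
```

The last line printed is the document.write statement with every quote and backslash escaped for its layer. The order matters inside the helper: backslashes are doubled before the quote is escaped, otherwise the newly added backslashes would be doubled as well.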


I want to dynamically generate this embedded JavaScript from the server with PHP. So I'll first verify that this works in between the body HTML tags of a PHP document. It does indeed work. Next, I'll wrap it with a PHP echo statement:


<?php echo "<script type=\"text/javascript\">document.write(\"<a href=\\\"#\\\" onclick=\\\"alert('Say \\\\'Hello World\\\\'');\\\">click here</a>\");</script>"; ?>


Escaping JavaScript quotes is recursive. After wrapping the PHP echo around the JavaScript with double quotes as outer quotes, I then had to escape the previous outer quotes. I then had to escape the inner-inner double quotes not once, but twice. Once to escape the backslash used to escape the quote, and once to escape the quote so that PHP wouldn't treat it as the end of the string. What I found most interesting is that the single quote used as the outer wrapper of the alert text was ignored during the PHP wrapping process because it is a literal quote. However, the innermost single quote had to have two backslashes added -- once to add another backslash and once to escape that backslash! Each time a wrapper is added, an escaped special character must be escaped again because during each iteration, a backslash escapes a backslash!
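The growth in backslashes can be counted with a short sketch. The helper names are made up for illustration, and `escapeForDoubleQuotes` only models the kind of double-quoted wrapping used in the steps above: double every backslash, then escape every double quote.

```javascript
// Hypothetical helper: escape `text` for use inside a double-quoted string literal.
function escapeForDoubleQuotes(text) {
  return text.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
}

// Count the backslash characters in a string.
function countBackslashes(text) {
  return (text.match(/\\/g) || []).length;
}

let doubleQuote = '"';   // a bare double quote, as in the generated HTML
let singleQuote = "\\'"; // the innermost single quote, already escaped once: \'

for (let layer = 1; layer <= 2; layer++) {
  doubleQuote = escapeForDoubleQuotes(doubleQuote);
  singleQuote = escapeForDoubleQuotes(singleQuote);
  console.log("layer " + layer + ": " + doubleQuote + " has " + countBackslashes(doubleQuote) +
    ", " + singleQuote + " has " + countBackslashes(singleQuote));
}
```

After two layers, the double quote carries three backslashes and the single quote carries four, which matches the PHP echo statement above.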

In summary, I'm not sure if this little exercise will be of any practical use. Perhaps it would be a good problem-solving exercise during an employment interview to see how good a programmer is at dealing with complex layers of abstraction. Perhaps a situation will arise where one actually needs to generate different JavaScript functions based on different criteria. I can't think of any examples other than the Multiple HTML Reply Signatures Embedder for Gmail Greasemonkey script that I wrote, and sometimes I think that there may have been an easier way to generate a dynamic drop-down list in Gmail without having to nest quotes in JavaScript. Bottom line -- I'm not sure. However, if this has proved helpful, please share your story with me!

Sunday, October 7, 2007

Update for HTML Multiple Reply Signatures for Gmail

10/09/07 UPDATE: Due to a change made to the Gmail URL structure, the HTML Multiple Reply Signatures Extension suddenly stopped working. I've updated the URL to reflect the newest, working version.


There have been some problems with the Firefox Extension that can make inserting HTML signatures in Gmail somewhat of a hassle. However, in spite of these issues, people have been downloading and using it. To my surprise and pleasure, I've received quite a bit of positive feedback. For those of you who have given me feedback, thank you! In order to address some of the issues that you've had and to accommodate some requests I've received, I've updated the extension, which can be downloaded by clicking on the following link:

Multiple HTML Reply Signatures Extension 1.0.2

Multiple HTML Reply Signatures Extension 1.0.2.1


Multiple HTML Reply Signatures Extension 1.0.2.2


I've added multi-line functionality to the editor, so you won't have to type the HTML all on one line. This should also make it much easier to paste HTML directly into the editor. Previously, pasted HTML caused the script to fail, since HTML is typically not written all on one line.

HTML Multiple Reply Signatures for Gmail Email Signature Editor

I also added an update notification to the new version. Firefox Extensions support update notifications through RDF, but since my hosting provider doesn't respond to RDF requests with "text/rdf" like an Apache server would, I was forced to create a hack that uses a regular expression to check the version number against my server. The cool thing about my custom version of update notifications is that the HTML Multiple Reply Signatures for Gmail extension will simply insert a "Click here to update" button in the tool bar.
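For readers curious about the general idea, a regular-expression version check could look something like the sketch below. It is not the extension's actual code: the function names, the format of the text fetched from the server, and the dotted-number comparison are all assumptions.

```javascript
// Hypothetical: pull the first dotted version number out of some text.
function extractVersion(text) {
  const match = /(\d+(?:\.\d+)+)/.exec(text);
  return match ? match[1] : null;
}

// Compare two dotted versions part by part; missing parts count as 0.
// Returns -1, 0, or 1.
function compareVersions(a, b) {
  const partsA = a.split(".").map(Number);
  const partsB = b.split(".").map(Number);
  const length = Math.max(partsA.length, partsB.length);
  for (let i = 0; i < length; i++) {
    const diff = (partsA[i] || 0) - (partsB[i] || 0);
    if (diff !== 0) return diff < 0 ? -1 : 1;
  }
  return 0;
}

// True when the text from the server names a newer version than the installed one.
function isUpdateAvailable(installedVersion, serverText) {
  const latest = extractVersion(serverText);
  return latest !== null && compareVersions(installedVersion, latest) < 0;
}

console.log(isUpdateAvailable("1.0.2", "Multiple HTML Reply Signatures Extension 1.0.2.2"));
```

In this sketch, an installed version of 1.0.2 checked against text mentioning 1.0.2.2 reports that an update is available.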

HTML Multiple Reply Signatures for Gmail Updates

It's not the most elegant way to notify users of an update, but I absolutely hate popup windows. This tool bar button is very unobtrusive. An image or a logo would be cool, but that's not my specialty so for now, it's plain text.

I started looking into the issue of why quotes won't work in the extension. This is turning out to be a lot more complicated and will take a lot more time to resolve. The problem is that the quotes have been escaped so many times that I think the strings aren't being parsed correctly. For now, you must enter your attributes without quotes like I've done below:

<a href=http://blog.opensourceopportunities.com style=color:orange>Blog</a>

Oh, I recently discovered that ampersand codes weren't supported. That has also been fixed. You can now use &nbsp; in your HTML, if needed.

I'll continue to fix bugs in this extension. The quotes are a big one, but not easily fixed. As always, if you have any problems, questions, concerns, or suggestions, please email me and I'll respond as quickly as I can.