From: Thomas 'PointedEars' Lahn on
Garrett Smith wrote:

> Thomas 'PointedEars' Lahn wrote:
>> Garrett Smith wrote:
> [...]
>> Don't. Let the code throw the original exception (`SyntaxError') or a
>> user-defined one (e.g., `JSONError') instead. (Unfortunately, the ES5
>> authors neglected to define an easily distinguishable exception type for
>> JSON parsing.)
>
> JSON was designed to throw an error with the least amount of information
> possible. This is, AIUI, to thwart attackers.

If so, that would be nonsense. Security by obscurity is a bad idea.

> If a fallback is made, the fallback will also throw a similar error.

But it should throw an exception. Your suggestion does not.

> In that case, it should be fairly clear that the interface performs
> equivalently across implementations.

And?

> Getting this right and getting the code short enough for the FAQ seems
> to be a challenge.

Not unless one has a serious reading problem.

> Meeting those goals, the result should be valuable and appreciated by
> many.

Which part of my suggestion did you not like?

> I'm also working on something else and that thing is wearing me out.

All the more reason to consider my suggestion.


PointedEars
--
Anyone who slaps a 'this page is best viewed with Browser X' label on
a Web page appears to be yearning for the bad old days, before the Web,
when you had very little chance of reading a document written on another
computer, another word processor, or another network. -- Tim Berners-Lee
From: Garrett Smith on
On 6/8/2010 6:27 PM, Thomas 'PointedEars' Lahn wrote:
> Garrett Smith wrote:
>
>> Thomas 'PointedEars' Lahn wrote:
>>> Garrett Smith wrote:
[...]

> If so, that would be nonsense. Security by obscurity is a bad idea.
>
>> If a fallback is made, the fallback will also throw a similar error.
>
> But it should throw an exception. Your suggestion does not.

Right - that one was no good and so it has to change.

>
>> In that case, it should be fairly clear that the interface performs
>> equivalently across implementations.
>
> And?
>
>> Getting this right and getting the code short enough for the FAQ seems
>> to be a challenge.
>
> Not unless one has a serious reading problem.
>

I did not see an expression for validJSONExp. The goal is: If a fallback
is made, the fallback will also throw a similar error.

To do that, the input must be verified against the same rules. Those
rules are defined in ECMA-262 Ed 5.

>> Meeting those goals, the result should be valuable and appreciated by
>> many.
>
> Which part of my suggestion did you not like?
>

Nothing, it's fine, but I did not see a regexp there that tests whether
the string is valid JSON.

>> I'm also working on something else and that thing is wearing me out.
>
> All the more reason to consider my suggestion.

The suggestion to use `this` in global context is right. The suggestion
to use an object literal as the string argument to JSON.parse is not any
better than using "true". Both are valid JSONValue, as specified in ES5.
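
For illustration, a minimal sketch of the two probes being compared; both
strings are valid JSON texts per ES5:

var probeWithLiteral = (JSON.parse('true') === true);
var probeWithObject = (JSON.parse('{"a":true}').a === true);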

Garrett
From: Thomas 'PointedEars' Lahn on
Garrett Smith wrote:

> Thomas 'PointedEars' Lahn wrote:
>> Garrett Smith wrote:
>>> Meeting those goals, the result should be valuable and appreciated by
>>> many.
>>
>> Which part of my suggestion did you not like?
>
> Nothing, it's fine, but I did not see a regexp there that tests whether
> the string is valid JSON.

There cannot be such a regular expression in ECMAScript as it does not
support PCRE's recursive matches feature. An implementation of a push-down
automaton, a parser, is required.
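
For illustration: even just checking that brackets nest correctly already
requires a stack, i.e. a minimal push-down automaton. A hypothetical
sketch (not a JSON validator):

function bracketsBalanced(text) {
  // Tracks only [] and {} nesting; ignores strings and all other JSON
  // productions, so this is only a sketch of the idea, not a validator.
  var stack = [], pairs = { ']': '[', '}': '{' };
  for (var i = 0; i < text.length; i++) {
    var ch = text.charAt(i);
    if (ch === '[' || ch === '{') {
      stack.push(ch);
    } else if (ch === ']' || ch === '}') {
      if (stack.pop() !== pairs[ch]) {
        return false;
      }
    }
  }
  return stack.length === 0;
}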

> [...] The suggestion to use an object literal as the string argument
> to JSON.parse is not any better than using "true".

But it is. It places further requirements on the capabilities of the
parser. An even better test would be a combination of all results of all
productions of the JSON grammar.
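
A hypothetical probe string along those lines (illustrative, not
exhaustive) could combine an object, an array, the three literals, a
signed number with fraction and exponent, and several string escapes:

var probe = '{"a":[true,false,null,-1.5e+2,"\\u0041\\n\\" \\\\ \\/"],"b":{}}';

var parserLooksComplete = false;
try {
  var o = JSON.parse(probe);
  parserLooksComplete = o.a[3] === -150 && o.a[4] === "A\n\" \\ /"
    && o.a[2] === null;
} catch (e) {
  // a conforming parser should not end up here
}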

> Both are valid JSONValue, as specified in ES5.

ACK


PointedEars
--
realism: HTML 4.01 Strict
evangelism: XHTML 1.0 Strict
madness: XHTML 1.1 as application/xhtml+xml
-- Bjoern Hoehrmann
From: Garrett Smith on
On 6/9/2010 3:37 AM, Thomas 'PointedEars' Lahn wrote:
> Garrett Smith wrote:
>
>> Thomas 'PointedEars' Lahn wrote:
>>> Garrett Smith wrote:
>>>> Meeting those goals, the result should be valuable and appreciated by
>>>> many.
>>>
>>> Which part of my suggestion did you not like?
>>
>> Nothing, it's fine, but I did not see a regexp there that tests whether
>> the string is valid JSON.
>
> There cannot be such a regular expression in ECMAScript as it does not
> support PCRE's recursive matches feature. An implementation of a push-down
> automaton, a parser, is required.
>

A parser would be too much for the FAQ.

The approach in json2.js would fit nicely, but is it acceptable?

The first thing that jumped out at me was that the number matching allows
a trailing decimal point -- something that is disallowed by JSONNumber.

json2.js has the regular expression used below. I've assigned the result
to `isValidJSON` so that I can refer to it later in this message.

var text = "...";

var isValidJSON = /^[\],:{}\s]*$/
  .test(text.replace(/\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g, '@')
    .replace(
      /"[^"\\\n\r]*"|true|false|null|-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g,
      ']')
    .replace(/(?:^|:|,)(?:\s*\[)+/g, ''));

Number is defined as:
-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?

But this allows numbers like "2.", so it can be changed to disallow that:
-?\d+(?:\.\d+)?(?:[eE][+\-]?\d+)?

Some implementations violate the spec by allowing the trailing decimal
(Firefox, Webkit), while others do not.
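
For example (behaviour varies by implementation, as noted above):

try {
  JSON.parse('[2.]'); // invalid per JSONNumber; a conforming
                      // implementation throws a SyntaxError here
} catch (e) {
  // conforming behaviour
}
// Implementations that reuse their ECMAScript number grammar may
// accept it and return [2] instead.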

Another problem is that the json2.js strategy allows some invalid syntax,
and multiple expressions separated by commas, to go through to eval. For
example, the values '[],{}' and ':}' both have a true result when tested
with the regexp pattern above. The first will actually succeed in eval
(or Function), whereas native JSON.parse throws a SyntaxError; the second
throws a SyntaxError when passed to eval or Function, but with a
different error message than JSON.parse produces.

JSON.parse('[],{}');
SyntaxError

eval('[],{}');
{}

An inconsistent result. What sorts of problems can that cause and how
significant are they?
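
To make the pass-through visible, the json2.js test can be wrapped in a
function (hypothetical name, for illustration only):

function passesJson2Test(text) {
  return /^[\],:{}\s]*$/
    .test(text.replace(/\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g, '@')
      .replace(
        /"[^"\\\n\r]*"|true|false|null|-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g,
        ']')
      .replace(/(?:^|:|,)(?:\s*\[)+/g, ''));
}

passesJson2Test('[],{}'); // true, yet JSON.parse('[],{}') throws
passesJson2Test(':}');    // true, yet JSON.parse(':}') throws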

>> [...] The suggestion to use an object literal as the string argument
>> to JSON.parse is not any better than using "true".
>
> But it is. It places further requirements on the capabilities of the
> parser. An even better test would be a combination of all results of all
> productions of the JSON grammar.
>

Cases that are known to be problematic can be filtered.

Not every possibility of the JSON grammar can be checked unless either
the string is parsed or the code performs a set of feature tests covering
every possibility. Those possibilities would not be limited to valid
JSON, but would include every invalid JSON construct as well.

The alternative is to check known cases where native JSON fails, see how
it fails, and then, if that failure is unacceptable, filter it out with a
feature test.

Garrett
From: Garrett Smith on
On 6/13/2010 1:22 PM, Garrett Smith wrote:
> On 6/9/2010 3:37 AM, Thomas 'PointedEars' Lahn wrote:
>> Garrett Smith wrote:
>>
>>> Thomas 'PointedEars' Lahn wrote:
>>>> Garrett Smith wrote:

[...]

>
> JSON.parse('[],{}');
> SyntaxError
>
> eval('[],{}');
> {}
>

One way to get around that is to wrap the whole expression not in the
Grouping Operator, but in array literal characters: '[' + text + ']'.
After evaluating that, the only valid result is an array with length 1;
anything else should cause the wrapper to throw a SyntaxError.

result = new Function("return[" + responseText + "]")();
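
For illustration, with that wrapping the comma case no longer slips
through as a single value; acceptance can be decided by the length of the
returned array:

new Function("return[" + "true" + "]")();  // [true]   -> length 1: accept
new Function("return[" + "[],{}" + "]")(); // [[], {}] -> length 2: reject
// ':}' wrapped as "return[:}]" still throws a SyntaxError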

I've also noticed that implementations allow some illegal characters.

| JSONStringCharacter ::
|   SourceCharacter but not double-quote " or backslash \
|     or U+0000 thru U+001F
|   \ JSONEscapeSequence

That specification is not hard to read, so I am not sure why
implementations have such errors (especially considering that the
implementors all work on the committee).

The specification even goes out of the way to state:

| Conforming implementations of JSON.parse and JSON.stringify must
| support the exact interchange format described in this specification
| without any deletions or extensions to the format. This differs
| from RFC 4627 which permits a JSON parser to accept non-JSON forms
| and extensions.

Didn't stop Firefox and IE from shipping buggy implementations.

The production for JSONStringCharacter states that a string character is
not any of: ", \, or \u0000-\u001f.

This is quite simple to test with a regexp:

var jsonString = /"[^"\\\n\r\u0000-\u001f]*"/

jsonString.test('""'); // true
jsonString.test('"\u0000"'); // false
jsonString.test('"\u001f"'); // false
jsonString.test('"\u0020"'); // true

I also noticed that json2.js does something strange, but does not
explain where the failure case exists:

| var cx = /[\u0000\u00ad\u0600-\u0604\u070f\u17b4\u17b5\u200c-\u200f\u2028-\u202f\u2060-\u206f\ufeff\ufff0-\uffff]/g,

(that long line will inevitably wrap and become obscured when quoted)

| // Parsing happens in four stages. In the first stage, we replace
| // certain Unicode characters with escape sequences. JavaScript
| // handles many characters incorrectly, either silently deleting
| // them, or treating them as line endings.
|
| text = String(text);
| cx.lastIndex = 0;
| if (cx.test(text)) {
|     text = text.replace(cx, function (a) {
|         return '\\u' +
|             ('0000' + a.charCodeAt(0).toString(16)).slice(-4);
|     });
| }

The code comment should provide a URL that describes the problem
(Bugzilla, etc.).

So far, I've identified four bugs in json2.js; two of these also exist in
Mozilla's and IE's implementations.

1) JSONNumber - allows a trailing decimal point after the integer part.
2) JSONString - allows the invalid characters \u0000-\u001f.
3) Evaluates expressions separated by the comma operator.
4) Does not always throw a generic SyntaxError as specified.

#4 seems unimportant, but I addressed it anyway. #1 and #2 do not seem
that bad, but if the goal is to mimic the spec, then why not do so (and
it isn't that hard)?

#3 is potentially bad, but unlikely. Again, just stick with the spec.

I've addressed these issues in the code below, but there might be more.

I do not know what the purpose of allowing \uFFFF in JSON.parse is, but
it is allowed by the specification.
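
For example (assuming a conforming implementation; U+FFFF is a
SourceCharacter and is not excluded by JSONStringCharacter):

var s = JSON.parse('["\uFFFF"]')[0];
s.length;        // 1
s.charCodeAt(0); // 65535 (0xFFFF)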

Code:

var parseJSON = function(responseText) {

  var NATIVE_JSON_PARSE_SUPPORT = typeof JSON == "object"
    && typeof JSON.parse == 'function'
    && JSON.parse('{"a":true}').a;

  function isValidJSON(text) {
    text = String(text);

    // Based on code by Douglas Crockford, from json2.js. The number
    // alternative uses \.\d+ so that a trailing decimal point ("2.")
    // is rejected, per JSONNumber.
    return !!text && /^[\],:{}\s]*$/
      .test(text.replace(/\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g, '@')
        .replace(
          /"[^"\\\n\r\u0000-\u001f]*"|true|false|null|-?\d+(?:\.\d+)?(?:[eE][+\-]?\d+)?/g,
          ']')
        .replace(/(?:^|:|,)(?:\s*\[)+/g, ''));
  }

  return (parseJSON = NATIVE_JSON_PARSE_SUPPORT ?
    function(responseText) {
      return JSON.parse(responseText);
    } : function(responseText) {
      var result;
      if (isValidJSON(responseText)) {
        try {
          // Wrap in array literal characters so that comma expressions
          // like '[],{}' cannot yield a single value.
          result = new Function("return[" + responseText + "]")();
        } catch (ex) { /* throw generic SyntaxError below */ }
        if (result && result.length === 1) {
          return result[0];
        }
      }
      throw SyntaxError("JSON parse error");
    })(responseText);
};
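
A possible usage (hypothetical input):

var data = parseJSON('{"answer": 42, "list": [1, 2, 3]}');
data.answer;      // 42
data.list.length; // 3

try {
  parseJSON('[],{}'); // rejected by both the native and the fallback branch
} catch (e) {
  // SyntaxError
}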

The questionable part of this strategy is relying on native support,
which has been shown to be buggy. They shipped it, and it is too late
now; I am sorry about that. The alternatives are either not to use JSON
or to run all input through the isValidJSON function.

Garrett