From: Lasse Reichstein Nielsen on
Lasse Reichstein Nielsen <lrn.unread(a)gmail.com> writes:

> Thomas 'PointedEars' Lahn <PointedEars(a)web.de> writes:
>
>> Is this from json2.js? If yes, then it is not acceptable. To begin with,
>> it does not regard "\"" valid JSON even though it is.
>
> It didn't use to be valid.
> Originally, a JSON text had to be either an object or an array, but not
> a simple value.
> This was changed at some point (I'm guessing during ES5 development) so
> that the grammar on json.org and the one in the ES5 spec allow JSON text
> to be any JSON value.
> JSON2 implements the original version.

Silly me, answering before checking.
It actually does allow "\"" as valid JSON.

/L
--
Lasse Reichstein Holst Nielsen
'Javascript frameworks is a disruptive technology'

From: Garrett Smith on
On 6/15/2010 10:20 PM, Lasse Reichstein Nielsen wrote:
> Lasse Reichstein Nielsen<lrn.unread(a)gmail.com> writes:
>
>> Thomas 'PointedEars' Lahn<PointedEars(a)web.de> writes:
>>
>>> Is this from json2.js? If yes, then it is not acceptable. To begin with,
>>> it does not regard "\"" valid JSON even though it is.
>>
>> It didn't use to be valid.
>> Originally, a JSON text had to be either an object or an array, but not
>> a simple value.
>> This was changed at some point (I'm guessing during ES5 development) so
>> that the grammar on json.org and the one in the ES5 spec allow JSON text
>> to be any JSON value.
>> JSON2 implements the original version.
>

Nope.

> Silly me, answering before checking.

You've not read my posts yet, apparently. The bug count of json2.js is up to
5, and some of those bugs also exist in IE's and Firefox's native implementations.

> It actually does allow "\"" as valid JSON.
>
It does; you just need to make sure that, if you're passing a string
literal, you take ECMAScript string escaping rules into account.

The string literal '"\""' in ECMAScript is equivalent to '"""'. Passed to
JSON.parse, '"""' is unparseable: it reads as the valid JSONString ""
followed by a stray quote mark, so a SyntaxError is thrown.

The backslash character and quote must be escaped in a JSONString.

The backslash in the ecmascript string must be escaped before it is
passed to JSON.parse, similarly to the way one escapes a string when
passing it to the RegExp constructor.

JSON.parse('"\\"')
- SyntaxError: the backslash escapes the closing quote, so the JSONString is never terminated
JSON.parse('"\\\\"')
- Successfully parses a string with the single character: \
JSON.parse('"\\""')
- Successfully parses a JSONString containing the single character "
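
The calls above can be checked directly. What follows is only a minimal sketch, assuming an engine with a conforming native JSON.parse; the helper name tryParse is made up for illustration and is not part of any library.

```javascript
// Hypothetical helper: reports whether JSON.parse accepted the text.
function tryParse(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (e) {
    return { ok: false, error: e.name };
  }
}

// Each source literal is first unescaped by the ECMAScript parser;
// JSON.parse only ever sees the resulting characters.
const unterminated = tryParse('"\\"');   // JSON.parse sees: "\"
const backslash = tryParse('"\\\\"');    // JSON.parse sees: "\\"
const quote = tryParse('"\\""');         // JSON.parse sees: "\""
const threeQuotes = tryParse('"\""');    // JSON.parse sees: """

console.log(unterminated.ok, unterminated.error); // false SyntaxError
console.log(backslash.value.length);              // 1 (a single backslash)
console.log(quote.value === '"');                 // true
console.log(threeQuotes.ok, threeQuotes.error);   // false SyntaxError
```

Counting characters with '"\\""'.length (which is 4) is a quick way to see what JSON.parse actually receives.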

Garrett

From: Thomas 'PointedEars' Lahn on
Lasse Reichstein Nielsen wrote:

> Thomas 'PointedEars' Lahn <PointedEars(a)web.de> writes:
>> Is this from json2.js? If yes, then it is not acceptable. To begin
>> with, it does not regard "\"" valid JSON even though it is.
>
> It didn't use to be valid.
> Originally, a JSON text had to be either an object or an array, but not
> a simple value.
> This was changed at some point (I'm guessing during ES5 development) so
> that the grammar on json.org and the one in the ES5 spec allow JSON text
> to be any JSON value.
> JSON2 implements the original version.

OK, but I don't care. A viable fallback for JSON.parse() has to accept the
same strings that JSON.parse() accepts, and only those.
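
One way to test that requirement is to run a candidate check and the native parser over the same inputs and list where they disagree. This is a rough sketch, assuming a conforming native JSON.parse; the names nativeAccepts, json2StyleCheck and disagreements are hypothetical. json2StyleCheck reproduces only the regular-expression check quoted in this thread; json2.js itself goes on to eval the text, which rejects some inputs on its own.

```javascript
// True if the native parser accepts the text.
function nativeAccepts(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (e) {
    return false;
  }
}

// The regular-expression check quoted in this thread, wrapped in a function.
function json2StyleCheck(text) {
  return /^[\],:{}\s]*$/.test(
    text
      .replace(/\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g, "@")
      .replace(/"[^"\\\n\r]*"|true|false|null|-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g, "]")
      .replace(/(?:^|:|,)(?:\s*\[)+/g, "")
  );
}

// Inputs on which the candidate and the native parser give different answers.
function disagreements(candidate, inputs) {
  return inputs.filter(function (text) {
    return candidate(text) !== nativeAccepts(text);
  });
}

const inputs = ['{}', '"\\""', '1.', '01', ']'];
console.log(disagreements(json2StyleCheck, inputs)); // [ '1.', '01', ']' ]
```

An empty result for a given input list does not prove equivalence; it only shows that no disagreement was found on those inputs.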

>>> Number is defined as:
>>> -?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g
>>>
>>> But this allows numbers like "2." so it can be changed to disallow that:
>>> -?\d+(?:\.\d+)?(?:[eE][+\-]?\d+)?/g
>>
>> It would still be insufficient. You simply cannot parse a context-free
>> non-regular language using only one application of only one non-PCRE.
>
> The idea of the regexp isn't to check that the grammar is correct,

Nobody wanted to check the grammar for correctness in the first place.
We have to accept its correctness as an axiom here.

> but merely that all the tokens are valid.

Which means that the string can be produced by application of the grammar.

> It's almost enough to guarantee that a successful eval on the string
> would mean that the grammar was also correct. But only almost,
> e.g., '{"x":{"y":42}[37,"y"]}' uses only correct tokens.
>
> Still, it disallows arbitrary code execution, which I guess is the main
> reason, and it correctly handles all valid JSON.

What the heck are you talking about?


PointedEars
--
realism: HTML 4.01 Strict
evangelism: XHTML 1.0 Strict
madness: XHTML 1.1 as application/xhtml+xml
-- Bjoern Hoehrmann

From: Thomas 'PointedEars' Lahn on
Garrett Smith wrote:

> Thomas 'PointedEars' Lahn wrote:
>> Garrett Smith wrote:
>>> Thomas 'PointedEars' Lahn wrote:
>>>> Garrett Smith wrote:
>>>>> Thomas 'PointedEars' Lahn wrote:
>>>>>> Garrett Smith wrote:
>>>>>>> Meeting those goals, the result should be valuable and appreciated
>>>>>>> by many.
>>>>>> Which part of my suggestion did you not like?
>>>>> Nothing, it's fine, but I did not see a regexp there that tests to see
>>>>> if the string is valid JSON.
>>>> There cannot be such a regular expression in ECMAScript as it does not
>>>> support PCRE's recursive matches feature. An implementation of a
>>>> push-down automaton, a parser, is required.
>>> A parser would be too much for the FAQ.
>> Probably, although I think it could be done in not too many lines for the
>> purpose of validation.
>
> That would require more code to be downloaded and more processing to run
> it. What about mobile devices?

You are confused. What good is shorter code that is not a solution?
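
To give an idea of the size involved, here is a sketch of a recursive-descent validator. It is an illustration only, not code from json2.js or from anyone in this thread. The name isValidJSONText is hypothetical, the string pattern is the corrected one given elsewhere in this thread, and the number pattern is my own reading of the json.org grammar. It assumes an engine that supports the sticky (y) regular-expression flag, and it recurses once per nesting level, so very deeply nested input could exhaust the call stack.

```javascript
// Hypothetical validator: true if text is a single JSON value with
// optional surrounding whitespace, false otherwise.
function isValidJSONText(text) {
  let i = 0; // current position in text

  // Sticky patterns match only at lastIndex, so they act as token readers.
  const stringRe = /"(?:[^"\\\x00-\x1F]|\\["\\\/bfnrt]|\\u[0-9A-Fa-f]{4})*"/y;
  const numberRe = /-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?/y;
  const literalRe = /true|false|null/y;

  function ws() {
    while (i < text.length && " \t\n\r".includes(text[i])) i++;
  }

  // Try to consume one token at position i; advance only on success.
  function token(re) {
    re.lastIndex = i;
    if (re.exec(text) === null) return false;
    i = re.lastIndex;
    return true;
  }

  function value() {
    ws();
    const c = text[i];
    let ok;
    if (c === "{") ok = object();
    else if (c === "[") ok = array();
    else if (c === '"') ok = token(stringRe);
    else ok = token(literalRe) || token(numberRe);
    ws();
    return ok;
  }

  function array() {
    i++; // consume "["
    ws();
    if (text[i] === "]") { i++; return true; }
    for (;;) {
      if (!value()) return false;
      if (text[i] === ",") { i++; continue; }
      if (text[i] === "]") { i++; return true; }
      return false;
    }
  }

  function object() {
    i++; // consume "{"
    ws();
    if (text[i] === "}") { i++; return true; }
    for (;;) {
      ws();
      if (!token(stringRe)) return false; // member name
      ws();
      if (text[i] !== ":") return false;
      i++;
      if (!value()) return false;
      if (text[i] === ",") { i++; continue; }
      if (text[i] === "}") { i++; return true; }
      return false;
    }
  }

  // The whole text must be consumed by exactly one value.
  return value() && i === text.length;
}

console.log(isValidJSONText('{"x":{"y":42}[37,"y"]}')); // false
console.log(isValidJSONText(' [1, 2 , {"a": null}] ')); // true
```

The structure mirrors the grammar: one function per production (value, array, object) and regular expressions only for the tokens, which are regular.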

>>> var isValidJSON = /^[\],:{}\s]*$/.
>>> test(text.replace(/\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g, '@').
>>> replace(
>>> /"[^"\\\n\r]*"|true|false|null|-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g,
>>> ']').
>>> replace(/(?:^|:|,)(?:\s*\[)+/g, ''))
>>
>> Is this from json2.js? If yes, then it is not acceptable. To begin
>> with, it does not regard "\"" valid JSON even though it is.
>
> The code is from json2.js:
> http://www.json.org/json2.js

Then it must be either summarily dismissed, or updated at least as follows:

/"([^"\\]|\\.)*"|.../

because *that* is the proper way to match a double-quoted string with
optional escape sequences. Refined for JSON, it must be at least

/"([^"\\^\x00-\x1F]|\\["\\\/bfnrt]|\\u[0-9A-Fa-f]{4})*"|.../

> The character sequence "\"" is valid JSON value in ecmascript,

That is gibberish. Either it is JSON, or it is ECMAScript.

> however in ecmascript, if enclosed in a single quote string, as - '"\""' -
> the backslash would escape the double quote mark, resulting in '"""',
> which is not valid JSON.

You are confused.

"\"" is both an ES string literal and JSON for the string containing "

"\\\"" is both an ES string literal and JSON for the string containing \"

"\\"" is neither an ES string literal nor JSON.

> To pass a string value containing the character sequence "\"" to

But that was not the purpose of the JSON string.

> JSON.parse, the backslash must be escaped. Thus, you would use:
>
> var quoteMarkInJSONString = '"\\""';

Yes, but that is not how JSON is usually supplied. That is, the
escaping backslash is then _not_ escaped, and the characters that
quoteMarkInJSONString contains are

\"

and not

"

whereas only the latter was intended.
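
The difference between JSON text as it arrives and JSON text written inside a string literal can be made concrete. A small sketch, assuming an engine that supports String.raw (ES2015); String.raw stands in here for text read from a file or an HTTP response, where no ECMAScript unescaping takes place. The variable names are made up for illustration.

```javascript
// String.raw keeps backslashes exactly as typed.
const received = String.raw`"\""`;   // characters: " \ " "
const literal = '"\\""';             // the same characters, written as a literal

console.log(received === literal);          // true
console.log(received.length);               // 4
console.log(JSON.parse(received) === '"');  // true

// The three texts listed above, taken as raw characters:
const escapedBackslashAndQuote = JSON.parse(String.raw`"\\\""`);
console.log(escapedBackslashAndQuote.length); // 2 (a backslash, then a quote)

let thirdIsJSON = true;
try {
  JSON.parse(String.raw`"\\""`); // "\\" is a complete string; the last " is stray
} catch (e) {
  thirdIsJSON = false;
}
console.log(thirdIsJSON); // false
```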

> And that works.

A JSON string *literal* may very well contain a literal backslash character,
and it may also contain a literal double quote. The expression fails to
recognize that.

> JSON.parse(quoteMarkInJSONString) == JSON.parse('"\\""') == "\""
>
> Result: string value containing the single character: ".
>
> JSON.parse(quoteMarkInJSONString) === "\""

You miss the point.

>>> Number is defined as:
>>> -?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g
>>>
>>> But this allows numbers like "2." so it can be changed to disallow that:
>>> -?\d+(?:\.\d+)?(?:[eE][+\-]?\d+)?/g
>>
>> It would still be insufficient. You simply cannot parse a context-free
>> non-regular language using only one application of only one non-PCRE.
>
> The goal of json2.js's JSON.parse is not to filter out values that are
> valid; it is to eliminate values that are invalid. So far, it was
> noticed to fail at that in four ways and I addressed those.

You are very confused.

>>>>> [...] The suggestion to use an object literal as the string to the
>>>>> argument to JSON.parse is not any better than using "true".
>>>>
>>>> But it is. It places further requirements on the capabilities of the
>>>> parser. An even better test would be a combination of all results of
>>>> all productions of the JSON grammar.
>>>
>>> Cases that are known to be problematic can be filtered.
>>
>> Your point being?
>
> My point is that instead of trying every possible valid grammar check,
> known bugs -- such as allowing 1. and +1 and 01, as seen in Spidermonkey
> -- could be checked.

The purpose of this was to provide a viable fallback for JSON.parse().
Both your suggestion and the one in json2.js fail to do that.
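
For the number cases mentioned above (1., +1, 01), the revised pattern from this thread can be compared with one that also rules out leading zeros. This is a sketch; both patterns are anchored here so that they test a whole token, and jsonNumber is my own reading of the json.org number grammar, not something taken from json2.js.

```javascript
// Revised pattern from this thread, anchored to test a whole token.
const threadNumber = /^-?\d+(?:\.\d+)?(?:[eE][+\-]?\d+)?$/;

// Assumed stricter pattern: the integer part is 0, or starts with 1-9.
const jsonNumber = /^-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+\-]?\d+)?$/;

["1.", "+1", "01", "-0.5e+10", "0"].forEach(function (s) {
  console.log(s, threadNumber.test(s), jsonNumber.test(s));
});
// 1.        false false
// +1        false false
// 01        true  false
// -0.5e+10  true  true
// 0         true  true
```

Requiring \d+ after the decimal point rejects "1.", but "01" still gets through unless the integer part is constrained as well.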

> Checking every possible input is not possible.

Yes, it is.

> [TLDR]


PointedEars
--
Prototype.js was written by people who don't know javascript for people
who don't know javascript. People who don't know javascript are not
the best source of advice on designing systems that use javascript.
-- Richard Cornford, cljs, <f806at$ail$1$8300dec7(a)news.demon.co.uk>

From: Thomas 'PointedEars' Lahn on
Thomas 'PointedEars' Lahn wrote:

> Garrett Smith wrote:
>> Thomas 'PointedEars' Lahn wrote:
>>> Garrett Smith wrote:
>>>> var isValidJSON = /^[\],:{}\s]*$/.
>>>> test(text.replace(/\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g, '@').
>>>> replace(
>>>> /"[^"\\\n\r]*"|true|false|null|-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g,
>>>> ']').
>>>> replace(/(?:^|:|,)(?:\s*\[)+/g, ''))
>>>
>>> Is this from json2.js? If yes, then it is not acceptable. To begin
>>> with, it does not regard "\"" valid JSON even though it is.
>>
>> The code is from json2.js:
>> http://www.json.org/json2.js
>
> Then it must be either summarily dismissed, or updated at least as
> follows:
>
> /"([^"\\]|\\.)*"|.../
>
> because *that* is the proper way to match a double-quoted string with
> optional escape sequences. Refined for JSON, it must be at least
>
> /"([^"\\^\x00-\x1F]|\\["\\\/bfnrt]|\\u[0-9A-Fa-f]{4})*"|.../
          ^
Typo, must be

/"([^"\\\x00-\x1F]|\\["\\\/bfnrt]|\\u[0-9A-Fa-f]{4})*"|…/
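
The effect of the stray caret can be seen on a string that contains one. A short sketch, with both expressions anchored to test a whole string token and the elided alternatives left out; the variable names are made up for illustration.

```javascript
// With the typo: "^" sits inside the negated class, so it is excluded too.
const withTypo = /^"([^"\\^\x00-\x1F]|\\["\\\/bfnrt]|\\u[0-9A-Fa-f]{4})*"$/;

// Corrected: only the quote, the backslash and control characters are excluded.
const corrected = /^"([^"\\\x00-\x1F]|\\["\\\/bfnrt]|\\u[0-9A-Fa-f]{4})*"$/;

console.log(withTypo.test('"a^b"'));      // false
console.log(corrected.test('"a^b"'));     // true

console.log(corrected.test('"\\""'));     // true:  "\"" is one escaped quote
console.log(corrected.test('"\\"'));      // false: "\" is unterminated
console.log(corrected.test('"\""'));      // false: """ has a stray quote
console.log(corrected.test('"\\x"'));     // false: \x is not a JSON escape
console.log(corrected.test('"\\u00e9"')); // true:  \u with four hex digits
```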


PointedEars