From: Daniel Egeberg on
On Thu, Mar 11, 2010 at 23:16, Daniel Egeberg <degeberg(a)php.net> wrote:
> On Thu, Mar 11, 2010 at 22:57, George Langley <george.langley(a)shaw.ca> wrote:
>>        Hi all. Is there an issue with $_GET not handling a Base64-encoded value correctly? (PHP is 5.1.6)
>>        Am receiving a Base64-encoded value:
>>
>> theurl.com/index.php?message=xxxxx
>>
>>  and retrieving it with $_GET:
>>
>> echo $_GET["message"];
>>
>> xxxxx is a Japanese phrase that has been encoded into Base64, so it uses the + symbol:
>>
>> ...OODq+OCou...
>>
>> but my $_GET is replacing the + with a space:
>>
>> ...OODq OCou...
>>
>> thus the base64_decode() is failing (displays diamonds with question marks on my Mac).
>>
>>        The Base64-encoded string is 156 characters long, if that has any bearing. My test URL is 230 characters in total, less than the "old" 256 limit.
>>        All I can find online is a reference that PHP will no longer assume that a space is a +:
>>
>> <http://ca3.php.net/manual/en/function.base64-decode.php#69298>
>>
>> but my problem is the opposite - the + symbols are there, but the GET is removing them.
>>        (And to add a wrinkle, this then goes into a Joomla! page, whose getVar() command completely removes the +, so I couldn't even do a string replace, as I don't know where the + should have been!)
>>
>>        Tired of looking at the dark red spot on the wall! Thanks.
>
> PHP does a urldecode() on GET parameters, which regards + as a space.
> You should be able to get the information you need using
> $_SERVER['QUERY_STRING'].
>
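
A minimal sketch of the difference, for anyone who wants to see it in isolation. The query string is a made-up stand-in for $_SERVER['QUERY_STRING'], and raw_query_value() is a hypothetical helper, not a PHP built-in. It relies on rawurldecode(), which decodes %XX escapes but leaves + alone, whereas parse_str() applies the same decoding rules that populate $_GET.

```php
<?php
// Hypothetical raw query string, standing in for $_SERVER['QUERY_STRING'].
// The value is Base64 that was NOT URL-encoded by the sender.
$queryString = 'message=YeODq+OCog==&lang=ja';

// Hypothetical helper: fetch one parameter from a raw query string
// without turning '+' into a space.
function raw_query_value($queryString, $name)
{
    foreach (explode('&', $queryString) as $pair) {
        // Split on the first '=' only, so Base64 padding stays in the value.
        $parts = explode('=', $pair, 2);
        if (rawurldecode($parts[0]) === $name) {
            return isset($parts[1]) ? rawurldecode($parts[1]) : '';
        }
    }
    return null; // parameter not present
}

// parse_str() decodes the way $_GET does: '+' becomes a space.
parse_str($queryString, $parsed);
$viaGetRules = $parsed['message'];

// The helper keeps the '+' so base64_decode() gets valid input.
$viaRaw = raw_query_value($queryString, 'message');
```

Only $viaRaw is still valid Base64 here; $viaGetRules has a space where the + used to be.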

And this is now made a bit clearer in the manual:
http://svn.php.net/viewvc?view=revision&revision=296092

Should propagate to the mirrors some time tomorrow.

--
Daniel Egeberg
From: George Langley on
Hi again. Thanks for all the info!
Not sure I'd agree that GET should just "assume" it was URLencoded, but hey - who am I to argue?  :-{)]
As mentioned, this is eventually buried into a Joomla! site's login functions (displays any errors). So not sure I'd have access to the originating call to URLencode it before sending, if it's part of the standard Joomla! login function and not some of our custom code.
However, Mike's suggestion to "pre-parse" it at our end:

$_GET['foo'] = str_replace(' ', '+', $_GET['foo']);

appears to work fine. I put the above in just before the GET in my bare-bones PHP-only test, and in the actual Joomla! page just before their equivalent call:

echo base64_decode(JRequest::getVar('message', '', 'method', 'base64'));

Have tested it with strings that also included a / (being the other non-alphanumeric character that Base64 uses), and it remains unaffected.
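
A rough sketch of why the replace is safe, under the assumption that the value is standard Base64: the Base64 alphabet contains no space, so any space in the received value can only have been a +. The sample bytes are made up for illustration (an "a" followed by two katakana characters in UTF-8), and urldecode() is used here to simulate what happens to the value on its way into $_GET.

```php
<?php
// Hypothetical sample: "a" plus two katakana characters, as UTF-8 bytes.
$original = "a\xE3\x83\xAB\xE3\x82\xA2";
$encoded  = base64_encode($original);

// Simulate the value arriving through $_GET without being URL-encoded:
// urldecode() turns every '+' into a space and leaves '/' and '=' alone.
$received = urldecode($encoded);

// Base64 never contains a space, so each space must have been a '+'.
$repaired = str_replace(' ', '+', $received);

// Strict mode returns false on characters outside the Base64 alphabet.
$decoded = base64_decode($repaired, true);
```

This only helps if it runs before anything strips the spaces out, which is why it has to come ahead of a filtering call such as the getVar() one above.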
So, guess I can either add the pre-parse wherever I need to, or try to locate the call to see if I can urlencode it (and who am I to argue why they didn't do that too?!)
Thanks again.

George
From: Ashley Sheridan on
On Thu, 2010-03-11 at 17:34 -0700, George Langley wrote:

> Hi again. Thanks for all the info!
> Not sure I'd agree that GET should just "assume" it was URLencoded, but hey - who am I to argue? :-{)]
> As mentioned, this is eventually buried into a Joomla! site's login functions (displays any errors). So not sure I'd have access to the originating call to URLencode it before sending, if it's part of the standard Joomla! login function and not some of our custom code.
> However, Mike's suggestion to "pre-parse" it at our end:
>
> $_GET['foo'] = str_replace(' ', '+', $_GET['foo']);
>
> appears to work fine. I put the above in just before the GET in my bare-bones PHP-only test, and in the actual Joomla! page just before their equivalent call:
>
> echo base64_decode(JRequest::getVar('message', '', 'method', 'base64'));
>
> Have tested it with strings that also included a / (being the other non-alphanumeric character that Base64 uses), and it remains unaffected.
> So, guess I can either add the pre-parse wherever I need to, or try to locate the call to see if I can urlencode it (and who am I to argue why they didn't do that too?!)
> Thanks again.
>
> George


Of course GET data would be assumed to be url encoded, it's part of the
URL, what other format could it take?! :p
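
For completeness, a small sketch of the sending side, assuming you do control the code that builds the link. The sample bytes are made up; the URL reuses the placeholder host from the original question. urlencode() escapes +, / and = so the Base64 value survives the trip through $_GET unchanged, and parse_str() stands in for the decoding PHP does on the receiving end.

```php
<?php
// Hypothetical sample: "a" plus two katakana characters, as UTF-8 bytes.
$original = "a\xE3\x83\xAB\xE3\x82\xA2";

// Sender: URL-encode the Base64 value before putting it in the link.
$url = 'http://theurl.com/index.php?message='
     . urlencode(base64_encode($original));

// Receiver: parse_str() decodes the query the same way $_GET is filled,
// so the escaped '+' comes back as a literal '+'.
$query = parse_url($url, PHP_URL_QUERY);
parse_str($query, $params);

$decoded = base64_decode($params['message']);
```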

Thanks,
Ash
http://www.ashleysheridan.co.uk