Edge browser cookie store location and access - winapi

How can I programmatically enumerate and delete Edge browser's cookies?
They don't appear to be among the IE cookies in temporary internet files, and therefore seem not to be returned by the FindFirstUrlCacheEntry/FindNextUrlCacheEntry API calls.
I can see cookie files in
C:\Users\...\AppData\Local\Packages\Microsoft.MicrosoftEdge_8wekyb3d8bbwe\AC\#!001\MicrosoftEdge\Cookies
C:\Users\...\AppData\Local\Packages\Microsoft.MicrosoftEdge_8wekyb3d8bbwe\AC\#!002\MicrosoftEdge\Cookies
C:\Users\...\AppData\Local\Packages\Microsoft.MicrosoftEdge_8wekyb3d8bbwe\AC\MicrosoftEdge\Cookies
What is the distinction between the three directories? How can they be accessed, and individual cookies deleted, programmatically?

This isn't going to be a perfect answer, but perfect is the enemy of good, etc.
It seems like Edge still uses at least the first two locations. I don't see any recent cookies in the last one. However, maybe that is just coincidence.
I've tried running several windows and tabs to see if different folders get used by different content processes, but I've not been able to figure out much in that department, either.
What I can tell you is the format of these files: they are "*\n" separated cookie collections. Every cookie has a number of fields, which are "\n"-separated.
Edit (2015/12/16): Just stumbled across my own answer here again, and I need to note that some of the cookie field values can themselves end with "*", in which case searching for the "*\n" delimiter will make it look like the cookie finishes early. No, the values are not escaped (which would have made sense...). So your best bet is really just to count the number of lines, which is unfortunate. This was fixed in the first portion of this patch for Firefox, which is present in Firefox 44 and later.
The cookie fields are documented in Firefox's source code:
The cookie file format is newline-separated values with a "*" used as the delimiter between records.
Each cookie has the following fields:
name
value
host/path
flags
Expiration time most significant integer
Expiration time least significant integer
Creation time most significant integer
Creation time least significant integer
At least, this seems to have been the format in IE, and the format here seems to be so similar that I would be surprised if they were materially different.
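Under those assumptions (eight newline-separated fields per record, a lone "*" line between records, values unescaped), a line-counting parser can be sketched as follows. This is illustrative Python, not a contract Edge guarantees, and the field names are descriptive only:

```python
# Field layout as documented above; the names are descriptive only.
FIELDS = ("name", "value", "host_path", "flags",
          "expiry_hi", "expiry_lo", "created_hi", "created_lo")

def parse_cookie_file(text):
    # Values are not escaped, so a value ending in "*" would fool a
    # naive split on "*\n"; counting lines per record avoids that.
    lines = text.split("\n")
    cookies, i = [], 0
    while i + len(FIELDS) <= len(lines):
        cookies.append(dict(zip(FIELDS, lines[i:i + len(FIELDS)])))
        i += len(FIELDS)
        if i < len(lines) and lines[i] == "*":
            i += 1  # skip the "*" record separator
        else:
            break   # truncated record or trailing junk
    return cookies
```

Because records are consumed by position rather than by delimiter, a value like "abc*" parses as an ordinary field instead of prematurely terminating the record.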
I just submitted a patch for using Firefox's existing IE cookie reading code for Edge's cookies, and that seemed to work. Here's the reviewboard review for it, and the revlink in hg.

Related

How do you RESTfully get a complicated subset of records?

I have a question about getting 'random' chunks of available content from a RESTful service, without duplicating what the client has already cached. How can I do this in a RESTful way?
I'm serving up a very large number of items (little articles with text and urls). Let's pretend it's:
/api/article/
My (software) clients want to get random chunks of what's available. There are too many to load them all onto the client. They have no natural order, so it's not a situation where they can just ask for the latest. Instead, there are around 6-10 attributes that the client may give to 'hint' what type of articles they'd like to see (e.g. popular, recent, trending...).
Over time the clients get more and more content, but at the server I have no idea what they have already, and because they're sent randomly, I can't just pass in the 'most recent' one they have.
I could conceivably send up the GUIDS of what's stored locally. The clients only store 50-100 locally. That's small enough to stuff into a POST variable, but not into the GET query string.
What's a clean way to design this?
Key points:
Data has no logical order
Clients must cache the content locally
Each item has a GUID
Want to avoid pulling down duplicates
You'll never be able to make this work satisfactorily if the data is truly kept in a random order (bear in mind the Dilbert RNG Effect); you need to fix the order for a particular client so that they can page through it properly. That's easy to do though; just make that particular ordering be a resource itself; at that point, you've got a natural (if possibly synthetic) ordering and can use normal paging techniques.
The main thing to watch out for is that you'll be creating a resource in response to a GET when you do the initial query: you probably should use a resource name that is a hash of the query parameters (including the client's identity if that matters) so that if someone does the same query twice in a row, they'll get the same resource (so preserving proper idempotency). You can always delete the resource after some timeout rather than requiring manual disposal…
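As a sketch of that idea (all names here are hypothetical — this shows the pattern, not a prescribed API): the initial query creates a snapshot resource named by a hash of the query parameters, and pages are then served from that frozen ordering.

```python
import hashlib
import json

ARTICLES = [{"guid": "guid-%03d" % i} for i in range(500)]  # stand-in data
SNAPSHOTS = {}  # snapshot id -> frozen list of GUIDs

def create_query_snapshot(params):
    # Deterministic name: the same query always maps to the same
    # resource, which keeps the creating GET effectively idempotent.
    key = hashlib.sha256(
        json.dumps(params, sort_keys=True).encode()).hexdigest()[:16]
    if key not in SNAPSHOTS:
        SNAPSHOTS[key] = [a["guid"] for a in ARTICLES]  # freeze an ordering
    return key

def get_page(snapshot_id, page, size=50):
    # Normal paging against the frozen ordering: no duplicates across pages.
    ids = SNAPSHOTS[snapshot_id]
    return ids[page * size:(page + 1) * size]
```

Including the client's identity in `params` gives each client its own snapshot; expiring entries from `SNAPSHOTS` after a timeout stands in for the resource disposal mentioned above.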

Should the length of a URL string be limited to increase security?

I am using ColdFusion 8 and jQuery 1.7.2.
I am using CFAJAXPROXY to pass data to a CFC. Doing so creates a JSON array (argument collection) and passes it through the URL. The string can be very long, since quite a bit of data is being passed.
The site I am working on has existing code that limits the length of any URL query string to 250 characters. This is done in the application.cfm file by testing the length of the query string; if any query string is greater than 250 characters, the request is aborted. The purpose of this was to ensure that hackers or other malicious code couldn't be passed through the URL string.
Now that we are using the query string to pass JSON arrays in the URL, we discovered that the Ajax request was being aborted quite frequently.
We have many other security practices in place, such as stripping any "<>" tags from code and using CFQUERYPARAM.
My question is whether limiting the length of a URL string for the sake of security is a good idea or simply ineffective.
There is absolutely no correlation between URI length and security; it is rather a question of:
Limiting the amount of information that you provide to a user agent to a 'need to know' basis. This covers things such as the type of application server you run and its associated conventions, the web server you run and its associated conventions, and the operating system on the host machine. These are essentially things that can be considered vulnerabilities.
Reducing the impact of exploiting those vulnerabilities, i.e. introducing patches, ensuring correct configuration, etc.
As alluded to above, at the web tier this doesn't only cover GETs (your concern), but also POSTs, PUTs, DELETEs, and just about any other operation on an HTTP resource.
Moved this into an answer for Evik -
That seems (at best) completely unnecessary if the inputs are being properly sanitized. I'm sure someone clever can quickly defeat a "security by small doorway" defense, assuming that's the only defense.
OWASP has some good, sane guidelines for web security. As far as I've read, limiting the size of the url is not on the list. For more information, see: https://www.owasp.org/index.php/Top_10_2010-Main
I would also like to echo Hereblur's comment that this makes internationalization tricky, or maybe impossible.
I'm not a ColdFusion developer, but I think it's the same with other languages.
I think it helps just a little bit. The problem of malicious code or SQL injection should be handled by your application.
I agree that a limited query string length is safer and makes things more difficult for attackers, but you can't do this with POST data, and it limits some functionality. For example:
A single UTF-8 character may take 9 characters once percent-encoded; that means you can fit only about 27 non-English characters.
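That arithmetic is easy to check: a three-byte UTF-8 character percent-encodes to nine ASCII characters, so a 250-character cap holds only about 27 of them. A Python illustration of the encoding (not ColdFusion code):

```python
from urllib.parse import quote

ch = "ก"                # THAI CHARACTER KO KAI, 3 bytes in UTF-8
encoded = quote(ch)     # percent-encoding: one %XX triplet per byte
print(encoded)          # %E0%B8%81
print(len(encoded))     # 9 characters for a single glyph
print(250 // len(encoded))  # 27 such characters fit under a 250-char cap
```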
The only reason to limit has to do with performance and DoS attacks - not security per se (though DoS is a security threat in that it brings down your server). Web servers and app servers (including CF) allow you to limit the size of POST data so that your server won't be degraded by very large file uploads. Substantial URL data can likewise result in long-running requests as the server struggles to parse or handle or write it.
So there is some modest risk here related to such things. Back in the NT days, IIS 3 had a number of flaws that were "locked down" by limiting the length of the URL - but those days are long gone. There are far more exploits representing low-hanging fruit that I would look at first before examining this issue too closely - unless of course you feel you have a specific problem with folks probing you (with long URLs, I mean :).

LocalStorage, several Ajax requests or huge Ajax request?

I'm facing a really huge issue (at least for me). I'll give you some background.
I'm developing a mobile web app which will show information about bus stations and bike stations in my city. It will show GMap markers for both bus and bike stops and, if you click it, you will have info about the arrival time for the bus or how many bikes are available. I have that covered pretty nicely.
The problematic part is loading the stations.
My first approach was to load the whole set of stations at page load. It is about 200 KB of JSON, plus the time it takes to iterate through the array and put the markers on the map. They are loaded hidden, so when the user clicks on a line name they appear via the 'findMarker' function. If other stations were present on the map, they get hidden to avoid having too many markers.
This worked well on new phones such as the iPhone 4 or a brand-new HTC, but it doesn't perform well on two-year-old phones.
The second approach I thought of was to load them by request: you click on the station, it is loaded into the map and then shown. But this leads to two problems:
The first is that you may (or may not) end up performing several requests that return the same data.
The second is that, to avoid having too many markers, they must be hidden or deleted, so you may be deleting info that will be needed again shortly, like bike stations, which are loaded as a group and not by line.
Finally, I thought about using LocalStorage to store them, as they won't change that much. But that is a huge amount of data, and then a pain in the ass to retrieve since the entries are key-value pairs. Also (and I'm not really sure about this), I could encounter devices with no support for this feature and have to fall back to one of the other options.
So, that said, I thought that maybe someone has faced a similar problem and solved it in some way, or has some tips for me :).
Any help would be really appreciated.
The best approach depends on the behaviour of your users. Do they typically click on a lot of stations or just a few? Do they prefer a faster startup (on-demand loading) or a more responsive station detail display (pre-loading data)?
An approach worth investigating would be to load the data by request but employ browser caching (ETag and Expires headers) to avoid retrieving the same information over and over again. This would solve both your concerns without dealing with LocalStorage.
See this question for the different approaches to browser caching: ETag vs Header Expires
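The shape of that client-side caching logic, sketched with a stand-in transport (function and header-handling names here are illustrative, not a specific library's API):

```python
CACHE = {}  # url -> (etag, body)

def fetch(url, transport):
    # Conditional GET: send If-None-Match when we hold a cached copy.
    etag, body = CACHE.get(url, (None, None))
    headers = {"If-None-Match": etag} if etag else {}
    status, resp_headers, resp_body = transport(url, headers)
    if status == 304:
        return body  # server confirmed our cached copy is current
    CACHE[url] = (resp_headers.get("ETag"), resp_body)
    return resp_body

def toy_server(url, headers):
    # Stand-in transport: a server that answers 304 on a matching ETag.
    etag = '"v1"'
    if headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, None
    return 200, {"ETag": etag}, '{"stations": [1, 2, 3]}'
```

The second fetch of the same URL transfers no station data at all, which is exactly the bandwidth saving the answer describes.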

Can HTTP headers be too big for browsers?

I am building an AJAX application that uses both the HTTP content and HTTP headers to send and receive data. Is there a point at which data received via HTTP headers won't be read by the browser because it is too big? If yes, what is the limit, and is the behaviour the same in all browsers?
I know that theoretically there is no limit to the size of HTTP headers, but in practice, past what point could I run into problems on certain platforms or browsers, or with certain software installed on the client machine? I am looking for a guideline for the safe use of HTTP headers. In other words, to what extent can HTTP headers be used to transmit additional data without potential problems coming into play?
Thanks, for all the input about this question, it was very appreciated and interesting. Thomas answer got the bounty, but Jon Hanna's answer brought up a very good point about the proxy.
Short answers:
Same behaviour: No
Lowest limit found in popular browsers:
10 KB per header
256 KB for all headers in one response.
Test results from MacBook running Mac OS X 10.6.4:
Biggest response successfully loaded, all data in one header:
Opera 10: 150MB
Safari 5: 20MB
IE 6 via Wine: 10MB
Chrome 5: 250KB
Firefox 3.6: 10KB
Note
Those outrageously big headers in Opera, Safari and IE took minutes to load.
Note to Chrome:
Actual limit seems to be 256KB for the whole HTTP header.
Error message appears: "Error 325 (net::ERR_RESPONSE_HEADERS_TOO_BIG): Unknown error."
Note to Firefox:
When sending data through multiple headers 100MB worked fine, just split up over 10'000 headers.
My Conclusion:
If you want to support all popular browsers, 10 KB per header seems to be the limit, and 256 KB for all headers together.
My PHP Code used to generate those responses:
<?php
ini_set('memory_limit', '1024M');
set_time_limit(90);

$header = "";
$bytes = 256000;
for ($i = 0; $i < $bytes; $i++) {
    $header .= "1";
}

header("MyData: ".$header);

/* Firefox multiple headers
for ($i = 1; $i < 1000; $i++) {
    header("MyData".$i.": ".$header);
}
*/

echo "Length of header: ".($bytes / 1024).' kilobytes';
?>
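For comparison, non-browser HTTP clients impose limits of their own. As an illustrative experiment (Python rather than the PHP above, and not part of the original test run), CPython's http.client rejects any response header line longer than 64 KB:

```python
import http.client
import http.server
import threading

HEADER_BYTES = 70000  # just above http.client's 65536-byte line cap

class BigHeaderHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # Serve a response with one very large header, like the PHP script.
        self.send_response(200)
        self.send_header("MyData", "1" * HEADER_BYTES)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):  # keep the output quiet
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), BigHeaderHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("GET", "/")
try:
    conn.getresponse()
    outcome = "accepted"
except http.client.LineTooLong:
    outcome = "rejected"
server.shutdown()
print(outcome)  # rejected
```

So even if the browsers you target cope with a large header, intermediate tooling written against standard HTTP libraries may refuse it outright.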
In practice, while there are rules prohibiting proxies from dropping certain headers (indeed, quite clear rules on which headers can be modified, and even on how to inform a proxy whether it may modify a new header added by a later standard), this only applies to "transparent" proxies, and not all proxies are transparent. In particular, some wipe headers they don't understand as a deliberate security practice.
Also, in practice some do misbehave (though things are much better than they were).
So, beyond the obvious core headers, the amount of header information you can depend on being passed from server to client is zero.
This is just one of the reasons why you should never depend on headers being used well (e.g., be prepared for the client to repeat a request for something it should have cached, or for the server to send the whole entity when you request a range), barring the obvious case of authentication headers (under the fail-to-secure principle).
Two things.
First of all, why not just run a test that gives the browser progressively larger and larger headers and wait till it hits a number that doesn't work? Just run it once in each browser. That's the most surefire way to figure this out. Even if it's not entirely comprehensive, you at least have some practical numbers to go off of, and those numbers will likely cover a huge majority of your users.
Second, I agree with everyone saying that this is a bad idea. It should not be hard to find a different solution if you are really that concerned about hitting the limit. Even if you do test on every browser, there are still firewalls, etc to worry about, and there is absolutely no way you will be able to test every combination (and I'm almost positive that no one else has done this before you). You will not be able to get a hard limit for every case.
Though in theory, this should all work out fine, there might later be that one edge case that bites you in the butt if you decide to do this.
TL;DR: This is a bad idea. Save yourself the trouble and find a real solution instead of a workaround.
Edit: Since you mention that the requests can come from several types of sources, why not just specify the source in the request header and have the data contained entirely in the body? Have some kind of Source or ClientType field in the header that specifies where the request is coming from. If it's coming from a browser, include the HTML in the body; if it's coming from a PHP application, put some PHP-specific stuff in there; etc etc. If the field is empty, don't add any extra data at all.
The RFC for HTTP/1.1 clearly does not limit the length of the headers or the body.
According to this page modern browsers (Firefox, Safari, Opera), with the exception of IE can handle very long URIs: https://web.archive.org/web/20191019132547/https://boutell.com/newfaq/misc/urllength.html. I know it is different from receiving headers, but at least shows that they can create and send huge HTTP requests (possibly unlimited length).
If there's any limit in the browsers it would be something like the size of the available memory or limit of a variable type, etc.
Theoretically, there's no limit to the amount of data that can be sent in the browser. It's almost like saying there's a limit to the amount of content that can be in the body of a web page.
If possible, try to transmit the data through the body of the document. To be on the safe side, consider splitting the data up, so that there are multiple passes for loading.

Getting ETags right

I’ve been reading a book and I have a particular question about the ETag chapter. The author says that ETags might harm performance and that you must tune them finely or disable them completely.
I already know what ETags are and understand the risks, but is it that hard to get ETags right?
I’ve just made an application that sends an ETag whose value is the MD5 hash of the response body. This is a simple solution, easy to achieve in many languages.
Is using MD5 hash of the response body as ETag wrong? If so, why?
Why does the author (who obviously outsmarts me by many orders of magnitude) not propose such a simple solution?
This last question is hard to answer unless you are the author :), so I’m trying to find the weak points of using an MD5 hash as an ETag.
ETag is similar to the Last-Modified header. It's a mechanism to determine change by the client.
An ETag needs to be a unique value representing the state and specific format of a resource (a resource could have multiple formats that each need their own ETag). Not unique across the entire domain of resources, simply within the resource.
Now, technically, an ETag has "infinite" resolution compared to a Last-Modified header. Last-Modified only changes at a granularity of 1 second, whereas an ETag can be sub second.
You can implement both ETag and Last-Modified, or simply one or the other (or none, of course). If Last-Modified is not sufficient, then consider an ETag.
Mind, I would not set ETag for "every" resource. Basically, I wouldn't set it for anything that has no expectation of being cached (dynamic content notably). There's no point in that case, just wasted work.
Edit: I see your edit, and clarify.
MD5 is fine. The only downside is calculating MD5 all the time. Running MD5 on, say, a 200K PDF file, is expensive. Running MD5 on a resource that has no expectation of being cached is simply wasteful (i.e. dynamic content).
The trick is simply that whatever mechanism you use, it should be as cheap as Last-Modified typically is. Last-Modified is, again, typically, a property of the resource, and usually very cheap to access.
ETags should be similarly cheap. If you are using MD5, and you can cache/store the association between the resource and the MD5 hash, then that's a fine solution. However, recalculating the MD5 each time the ETag is necessary, is basically counter to the idea of using ETags to improve overall server performance.
We're using etags for our dynamic content in instela.
Our strategy is, at the end of output generation, to compute the MD5 hash of the content to send; if an If-None-Match header exists, we compare it with the generated hash. If the two values are the same, we send a 304 code and interrupt the request without returning any content.
It's true that we spend a bit of CPU hashing the content, but in the end we save a lot of bandwidth.
We have a Facebook-newsfeed-style main page which has different content for every user. As the newsfeed content changes only 3-4 times per hour, main page refreshes are very efficient for the client side. In the mobile era, I think it's better to spend a bit more CPU time than to spend bandwidth; bandwidth is still more expensive than CPU, and it makes for a better client experience.
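That strategy can be sketched in a few lines (hedged Python for the pattern, not instela's actual stack):

```python
import hashlib

def etag_for(body):
    # Strong validator: quoted MD5 hex digest of the exact bytes sent.
    return '"%s"' % hashlib.md5(body).hexdigest()

def respond(body, if_none_match=None):
    # Compare the client's If-None-Match against the freshly computed
    # hash; on a match, send 304 with no body.
    tag = etag_for(body)
    if if_none_match == tag:
        return 304, {"ETag": tag}, b""
    return 200, {"ETag": tag}, body

page = b"<html>per-user newsfeed</html>"
status, headers, payload = respond(page)               # first visit
status2, _, payload2 = respond(page, headers["ETag"])  # revisit
```

Note the trade-off discussed above still applies: the body is generated and hashed on every request, so this saves bandwidth, not server CPU.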
Having not read the book, I can't speak on the author's precise concerns.
However, the generation of ETags should be such that an ETag is only generated once when a page has changed. Generating an MD5 hash of a web page costs processing power and time on the server; if you have many clients connecting, it could start to cause performance problems.
Thus, you need a good technique for generating ETags only when necessary and caching them on the server until the related page changes.
I think the perceived problem with ETags is probably that the browser has to issue and parse a (simple and small) request/response for every resource on your page to check whether the ETag value has changed server-side.
I personally find these extra small round trips to the server acceptable for frequently changing images, CSS and JavaScript (the server does not need to resend the content if the browser's ETag is current), since the mechanism makes it quite easy to mark 'updated' content.
