Arranging headers by their code point from low to high?

I am looking to arrange my headers based on their code point from low to high. Below is my attempt, and I was wondering if someone could advise me on whether I have done this correctly. I basically looked up the ASCII chart (ASCII Chart) to do this manually.
Action -> X-Amz-Algorithm -> X-Amz-Credential -> X-Amz-Date -> X-Amz-SignedHeaders -> X-Amz-Signature

You need to sort the headers. Depending on the programming language, this looks different, but the key word is always "sort".
Java example:
List<String> headers = Arrays.asList("Action", "X-Amz-Algorithm", "...");
headers.sort(Comparator.naturalOrder());
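As a self-contained sketch of the same idea, the following sorts the header names from the question. It assumes plain ASCII names: `Comparator.naturalOrder()` on `String` compares UTF-16 code units, which matches code point order for ASCII. The class and method names (`HeaderSort`, `sortHeaders`) are made up for this illustration. Note that sorting places X-Amz-Signature before X-Amz-SignedHeaders, because 'a' (0x61) comes before 'e' (0x65).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class HeaderSort {
    // Returns a new list sorted in ascending code point order (for ASCII names).
    static List<String> sortHeaders(List<String> headers) {
        List<String> sorted = new ArrayList<>(headers);
        sorted.sort(Comparator.naturalOrder());
        return sorted;
    }

    public static void main(String[] args) {
        List<String> headers = Arrays.asList(
            "X-Amz-SignedHeaders", "X-Amz-Date", "Action",
            "X-Amz-Signature", "X-Amz-Credential", "X-Amz-Algorithm");
        // Print one header per line, lowest code point first.
        sortHeaders(headers).forEach(System.out::println);
    }
}
```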

Related

Decode Protobuf Text

I have some Protobuf text that I'm receiving via an HTTP response from a website. The text roughly looks like this:
1 {
2: some value
7: {
12: some value
}
8: some value
}
except the content is much larger. I don't want to paste the actual text for security purposes.
Anyways, how can I "decode" this so that I can see the schemas?
At the moment it is impossible to obtain a perfectly accurate schema from a protobuf message.
That being said, you can get semi-close. There are some tools like protobuf-inspector that can print out a bit more information about the structure of the message.
Some important caveats about this tool (and in general) as to why it's not possible to obtain the full schema, taken from the README of the tool:
[...] the field names are obviously lost, together with some high-level details such as:
whether a varint uses zig-zag encoding or not (will assume no zig-zag by default)
whether a 32-bit/64-bit value is an integer or float (both shown by default)
signedness (auto-detect by default)
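To make the first two caveats concrete, here is a small sketch (not taken from protobuf-inspector; the class name `WireAmbiguity` is made up) showing how the same raw bytes admit more than one reading. A varint whose raw value is 3 is either 3 or, under zig-zag decoding, -2. A 32-bit payload of 0x3F800000 is either a large integer or the float 1.0. Without the schema, nothing in the message says which reading is intended.

```java
class WireAmbiguity {
    // Standard protobuf zig-zag decoding for a 64-bit varint payload.
    static long zigZagDecode(long n) {
        return (n >>> 1) ^ -(n & 1);
    }

    public static void main(String[] args) {
        long varint = 3L; // raw value read off the wire
        System.out.println("plain:   " + varint);
        System.out.println("zig-zag: " + zigZagDecode(varint));

        int fixed32 = 0x3F800000; // raw 32-bit payload
        System.out.println("as int:   " + fixed32);
        System.out.println("as float: " + Float.intBitsToFloat(fixed32));
    }
}
```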

groupingBy operation in Java-8

I'm trying to rewrite the well-known example of Spark's text classification (http://chimpler.wordpress.com/2014/06/11/classifiying-documents-using-naive-bayes-on-apache-spark-mllib/) in Java 8.
I have a problem - in this code I'm making some data preparations for getting idfs of all words in all files:
termDocsRdd.collect().stream().flatMap(doc -> doc.getTerms().stream()
.map(term -> new ImmutableMap.Builder<String, String>()
.put(doc.getName(),term)
.build())).distinct()
And I'm stuck on the groupBy operation. (I need to group this by term, so each term must be a key and the value must be a sequence of documents).
In Scala this operation looks very simple - .groupBy(_._2).
But how can I do this in Java?
I tried to write something like:
.groupingBy(term -> term, mapping((Document) d -> d.getDocNameContainsTerm(term), toList()));
but it's incorrect...
Does anybody know how to write it in Java?
Thank you very much.
If I understand you correctly, you want to do something like this:
(import static java.util.stream.Collectors.*;)
Map<Term, Set<Document>> collect = termDocsRdd.collect().stream().flatMap(
doc -> doc.getTerms().stream().map(term -> new AbstractMap.SimpleEntry<>(doc, term)))
.collect(groupingBy(Map.Entry::getValue, mapping(Map.Entry::getKey, toSet())));
The use of Map.Entry/AbstractMap.SimpleEntry is due to the absence of a standard Pair<K,V> class in Java 8. Map.Entry implementations can fill this role, but at the cost of type and method names that are unintuitive and verbose for something that merely serves as a pair.
If you are using the current Eclipse version (I tested with LunaSR1 20140925) with its limited type inference, you have to help the compiler a little bit:
Map<Term, Set<Document>> collect = termDocsRdd.collect().stream().flatMap(
doc -> doc.getTerms().stream().<Map.Entry<Document,Term>>map(term -> new AbstractMap.SimpleEntry<>(doc, term)))
.collect(groupingBy(Map.Entry::getValue, mapping(Map.Entry::getKey, toSet())));
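For readers without the Spark types at hand, here is a runnable sketch of the same groupingBy/mapping pattern. It is only an illustration: documents are reduced to a plain map from document name to terms, and `TermIndex` and `docsByTerm` are made-up names, not part of the question's code.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class TermIndex {
    // Inverts "document name -> terms" into "term -> document names".
    static Map<String, Set<String>> docsByTerm(Map<String, List<String>> termsByDoc) {
        return termsByDoc.entrySet().stream()
            .flatMap(doc -> doc.getValue().stream()
                // Explicit type witness, as in the Eclipse variant above.
                .<Map.Entry<String, String>>map(term ->
                    new AbstractMap.SimpleEntry<>(doc.getKey(), term)))
            .collect(Collectors.groupingBy(
                Map.Entry::getValue,
                Collectors.mapping(Map.Entry::getKey, Collectors.toSet())));
    }

    public static void main(String[] args) {
        Map<String, List<String>> termsByDoc = new LinkedHashMap<>();
        termsByDoc.put("a.txt", Arrays.asList("spark", "java"));
        termsByDoc.put("b.txt", Arrays.asList("java", "scala"));
        // Each term maps to the set of documents that contain it.
        System.out.println(docsByTerm(termsByDoc));
    }
}
```

Because the downstream collector is toSet(), a term that occurs several times in one document still yields that document only once, which is what the distinct() call in the question was after.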

How do I format a date/time/number/currency in another locale?

How do I format something for another locale in Windows?
For example, in managed C# code, I would try to render a DateTime using the en-US locale with:
String s = DateTime.Now.ToString(CultureInfo.CreateSpecificCulture("en-US"));
TextRenderer.DrawText(
e.Graphics, s, SystemFonts.IconTitleFont,
new Point(16, 16), SystemColors.ControlText);
And that works fine when my computer's locale is en-US:
It even works fine when my computer's locale is de-DE:
But it completely falls apart when my computer's locale is ps-AF:
Note: My sample code is in .NET, but can also be native.
Update: Attempting to set System.Threading.Thread.CurrentThread.CurrentCulture to en-US before calling DrawText:
var oldCulture = System.Threading.Thread.CurrentThread.CurrentCulture;
System.Threading.Thread.CurrentThread.CurrentCulture = CultureInfo.CreateSpecificCulture("en-US");
try
{
// String s = DateTime.Now.ToString(CultureInfo.CreateSpecificCulture("en-US"));
String s = DateTime.Now.ToString();
TextRenderer.DrawText(e.Graphics, s, SystemFonts.IconTitleFont, new Point(16, 16), SystemColors.ControlText);
}
finally
{
System.Threading.Thread.CurrentThread.CurrentCulture = oldCulture;
}
No help.
Nine, no help
Jack, no help
Eight, possible straight
King, possible flush
Ace, no help
Six, possible straight
Dave of love for the dealer
Ace bets.
Update Two:
From Michael Kaplan's blog entry:
Sometimes, GDI respects users (even if no one else does!)
GDI doesn't give a crap about formatting or really anything related to locales, with one single exception:
Digit Substitution
Any time you go to render text it will grab those digit substitution settings in the user locale (including the user override information) and use the info to decide how to display numbers.
And there is no way to override those settings at the level where GDI uses them.
I wonder how Chrome manages it. When I write digits here, in the Stack Overflow question, Chrome renders them using Latin digits:
0123456789
What you are seeing is due to the digit substitution that occurs when your system's locale is ps-AF.
I believe that's OK -- Users of such a locale are used to seeing digits presented this way.
Normally the way this is done is slightly different, see here for example, but I don't actually think this should make any difference:
String s = DateTime.Now.ToString(new CultureInfo("en-US"));
An alternative is to set the current thread's culture to your desired locale.
I.e. do this:
Thread.CurrentThread.CurrentCulture = new CultureInfo("en-US");
And you can then replace the first line of your code with this:
String s = DateTime.Now.ToString();
I am not quite sure, but I believe that this would solve the digit substitution issue, as DrawText would now be based on the en-US culture rather than ps-AF.

CultureInfo: how to get only the language code

I'm developing a Windows Phone app.
How can I get the language code from CultureInfo.CurrentCulture?
I'm using CultureInfo.CurrentCulture.Name and I'm getting 'en-US'. I only need 'en'.
Have you tried using the TwoLetterISOLanguageName property?
I'm not sure exactly what you are trying to achieve. If all you want is to remove the region while retaining the script distinction (if you are interested in zh-Hans, for example, and not just zh), then you will want to use the Parent property. This can return a legacy name (zh-CHS), though, so you would want to use the IetfLanguageTag property to resolve that:
CultureInfo.CurrentCulture.Parent.IetfLanguageTag
en-US -> en
zh-CN -> zh-Hans
zh-TW -> zh-Hant
Sometimes it still isn't going to give you the expected answer, since it will only return language tags that are supported (but this isn't any different from the TwoLetterISOLanguageName property):
az-Cyrl-AZ -> az
az-Latn-AZ -> az
And it seems like some of the chains were omitted:
sr-Cyrl-BA -> (Invariant)
You can check for invariant and then return the TwoLetterISOLanguageName property to work around that.

Java applet - Real-time textfield input verification

I'm trying to develop real-time input verification on a text field in a Java applet.
The idea would be to have an input field that, if empty, would show something like "0,00" once the user clicks in it. Once the user starts pressing keys, only numbers should be accepted, and the text would fill in like this (imagine I input the numbers 1, 2, 3, 4, 5, 6):
"0,01" -> "0,12" -> "1,23" -> "12,34" -> "123,45" -> "1.234,56".
If the field is not empty the user can change the values but there will always be a "," dividing the decimal numbers.
I've been able to allow only numbers to be accepted, but how can I produce this kind of behavior? I know this may be a very specific question, but any links or examples would be much appreciated. Thank you.
You will have to provide an input handler that not only filters the input but also calls a callback (written by you) that updates the field the way you want it to be updated.
You can use functions that format numbers according to a given format.
Basically, keep the digits that have been input so far, parse them as a plain integer, then scale that integer by a power of 10 derived from the format. In your example, with two decimal places, that means multiplying by 10 raised to the power of -2 (i.e. dividing by 100).
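A minimal sketch of that formatting step, assuming the typed digits are collected in a string and that the field always shows two decimal places with "." for grouping and "," for decimals. The class name `AmountFormatter` is made up for illustration, and wiring it into the text field (for example through a key listener or document filter) is left out.

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class AmountFormatter {
    // Formats the digits typed so far as an amount with two decimal places.
    static String format(String digits) {
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ROOT);
        symbols.setGroupingSeparator('.');
        symbols.setDecimalSeparator(',');
        DecimalFormat pattern = new DecimalFormat("#,##0.00", symbols);

        // An empty field counts as zero; otherwise shift the integer two places.
        BigDecimal value = new BigDecimal(digits.isEmpty() ? "0" : digits)
            .movePointLeft(2);
        return pattern.format(value);
    }

    public static void main(String[] args) {
        StringBuilder typed = new StringBuilder();
        for (char key : "123456".toCharArray()) {
            if (key >= '0' && key <= '9') { // ignore non-digit keys
                typed.append(key);
            }
            // Show what the field would display after each key press.
            System.out.println(format(typed.toString()));
        }
    }
}
```

Using BigDecimal rather than double avoids rounding surprises when the amounts get large.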
