I'm having problems splitting a string number into single digits with Processing - processing

I'm new to Processing and I'm trying to split a string of digits into single array elements. Then my goal is to find how many numbers repeat themselves and print them out in an array. I'm not sure if I'm on the right track, though! I'm aware that there are some missing lines, but as I mentioned before I'm new and still exploring arrays, modulo and strings.
int[] dig = new string [1233467890];
int n=dig.length;
while(n<0){
arr[i--]=n%10
dig = n % 10;
n = n / 10;
}
println(arr);
Thanks ahead of time for help
Edwin

I think you are mixing things up a little bit here, especially what strings and arrays are.
An array is a sequence of objects, and these objects may be integers, characters, booleans, circles, cups or balls. A String is, in the programming universe, a very special type of array: it is an array of characters.
So, as you may have noticed, there's no way of creating a "string" of integers. And the Processing programming interface tells you exactly that if you try to run the code you posted:
"cannot convert [] String to [] Int". That means: strings and ints are fundamentally different things.
As I understood neither your goal nor your code, I can't help you any further.
I think it would be a better idea to read and understand the following links, run and understand the more basic examples there, and only then try to program what you want.
http://processing.org/reference/Array.html
http://processing.org/reference/String.html
Best regards

Related

how does this Ruby code work? (hash) (Learnrubythehardway)

I know I will look like a total noob, but there's something I can't wrap my head around. Let me emphasize that I DID google this thing, but I didn't find what I was looking for.
I'm going through the learnrubythehardway course, and for ex39 this is one of the functions we have defined:
def Dict.hash_key(aDict, key)
  return key.hash % aDict.length
end
The author gives this explanation:
hash_key
This deceptively simple function is the core of how a hash works. What it does is uses the built-in Ruby hash function to convert a
string to a number. Ruby uses this function for its own hash data
structure, and I'm just reusing it. You should fire up a Ruby console
to see how it works. Once I have a number for the key, I then use the
% (modulus) operator and the aDict.length to get a bucket where this
key can go. As you should know, the % (modulus) operator will divide
any number and give me the remainder. I can also use this as a way of
limiting giant numbers to a fixed smaller set of other numbers. If you
don't get this then use Ruby to explore it
I like this course, but the above paragraph was no help.
Ok, you call the function passing it two arguments (aDict is an array) and it returns something.
(My questions are not totally independent of one another.)
1. What and how does it do that? (ok, it returns a bucket index, but how do we "get there"?)
2. What does the key.hash do/what is it?
3. How does using the % help me get what I need? (What is the use of "modding" the key.hash by the aDict.length?)
4. "Use Ruby to explore it." - ok, but my question No.2. kinda already suggests that I wouldn't know how to go about doing that.
Thanks in advance.
key.hash is calling Object#hash, which is not to be confused with Hash.
Object#hash converts a string into a number consistently (the same string will always result in the same number, in the same running instance of Ruby).
pry(main)> "abc".hash
=> -1672853150
So now we have a number, but it's way too large for the number of buckets in our Dict structure, which defaults to 256 buckets. So we take it modulo 256 to get a number within our bucket range.
pry(main)> "abc".hash % 256
=> 98
This essentially allows us to translate Dict["abc"] into aDict[98].
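A minimal sketch of those two steps, which can be pasted into irb or pry. The helper name bucket_index is hypothetical (it just mirrors Dict.hash_key from the question), and the value of "abc".hash will differ from one Ruby process to the next, so only the range of the result is predictable.

```ruby
# Hypothetical helper mirroring Dict.hash_key from the question.
def bucket_index(key, bucket_count)
  # key.hash is some (possibly negative) integer; Ruby's % with a
  # positive right-hand side always lands in 0...bucket_count.
  key.hash % bucket_count
end

BUCKET_COUNT = 256

index = bucket_index("abc", BUCKET_COUNT)
puts index                                       # some number from 0 to 255
puts index == bucket_index("abc", BUCKET_COUNT)  # same key, same bucket
puts(-1672853150 % 256)                          # the value from the pry session: 98
```

Note that the negative hash from the pry session still maps to a valid array index, because Ruby's modulo takes the sign of the divisor.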
RE: This example in particular
I'm going to change the order of things in a way that I hope makes more sense:
#2. You can think of a hash as a sort of 'fingerprint' of something. The .hash method will create a (generally) unique output for any given input.
#3. In this case, we know that the hash is a number, so we take the modulo of the generated number by the backing array's length in order to find a (hopefully empty) index that is within our storage's bounds.
#1. That's how. A hashing algorithm will return the same output for any given input. The modulo takes this output and turns it into something we can actually use in an array to find something reliably.
#4. Call hash on something. Call it on a string and then modulo it by the length of an array. Try again on another string. Do that again, and use your result to assign something to that array. Do it again to see that the hash and modulo thing will find that value again.
Further Notes:
By itself, the modulo function is not a good way to pick unique indexes for keys. This example is the first step, but especially in a small array, there is still a relatively large chance for the hashes of different keys to modulo into the same number. That's called a collision, and handling those seems to be outside the scope of this question.
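To see a collision happen, here is a small sketch that puts five keys into four buckets, so at least two keys must share a bucket. It then shows one common way of coping, keeping a list of key/value pairs per bucket (separate chaining). The names put and get and the sample keys are made up for illustration; this is not necessarily how the course's Dict does it.

```ruby
BUCKETS = 4
keys = %w[apple banana cherry date elderberry]

# Five keys, four buckets: some bucket must receive more than one key.
by_bucket = keys.group_by { |key| key.hash % BUCKETS }
by_bucket.each { |index, ks| puts "bucket #{index}: #{ks.join(', ')}" }

# Separate chaining: every bucket is a list of [key, value] pairs.
def put(table, key, value)
  bucket = table[key.hash % table.length]
  pair = bucket.find { |k, _| k == key }
  if pair
    pair[1] = value          # key already present: overwrite its value
  else
    bucket << [key, value]   # new key: append to this bucket's list
  end
end

def get(table, key)
  pair = table[key.hash % table.length].find { |k, _| k == key }
  pair && pair[1]
end

table = Array.new(BUCKETS) { [] }
keys.each_with_index { |key, i| put(table, key, i) }
puts get(table, "cherry")    # 2, even if "cherry" shares its bucket
```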

The difference and use of strings and string arrays?

Okay, so for all I know a string is basically an array of characters. So why would there be string arrays in VB? And what differences are there between them?
Just the basics, the way they operate, that's what I'm interested in.
At times it is very useful to think of a String as an array of characters. It can also be useful to think of it as an array of bytes at times too - and this is of course not the same thing at all.
See The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) for better understanding of the differences between bytes and the characters held by Strings (UTF-16LE) as well as other character encodings commonly used.
But all of that aside, a String is really a higher level abstraction that you should not think of as an array of any kind.
After all, by that sort of logic an Integer or Long is an array as well.
So considering that a String is meant to be viewed as a primitive scalar value type the purpose of String arrays should be pretty clear. Arrays of Strings have pretty much the same sorts of uses as arrays of any other data type.
The fact that you have operations you can perform on Strings that root around inside them (substring operations) isn't much different conceptually than the operations that operate on the data inside any other simple type.
Say you need to store a list of names. It might be 100 names or 200 names, depending on the case. What will you do?
A String array handles such a case.
Try this:
Dim Names() As String
ReDim Names(3) As String
Names(0) = "First"
Names(1) = "Second"
Names(2) = "Third"
Names(3) = "Fourth"
Dim l As Long
For l = LBound(Names) To UBound(Names)
    MsgBox Names(l)
Next

Algorithm to Map Strings to Short Replacements

I'm looking at ways to deterministically replace unique strings with unique and optimally short replacements. So I have a finite set of strings, and the best compression I could achieve so far is through an enumeration algorithm, where I order the input set and then replace the strings with an enumeration of char strings over an extended alphabet (a..z, A...Z, aa...zz, aA... zZ, a0...z9, Aa..., aaa...zaa, aaA...zaaA, ....).
This works wonderfully as far as compression is concerned, but has the severe drawback that it is not atomic on any given input string. Rather, its result depends on knowing all input strings right from the start, and on the ordering of the input set.
Does anybody know of an algorithm that has similar compression but doesn't require knowing all input strings upfront? Hashing, for example, would not work for me, as depending on the size of the input set I'd need a hash length of 8-12 for the hashes to be unique, and that would be too long as replacements (currently, the replacement strings are 1-3 chars long for my use cases (<10,000 input strings)). Also, if theoreticians among us know this is wasted effort, I would be interested to hear :-) .
You could use your enumeration scheme, but sorted by the order in which you first encounter the input strings.
For example, the first string you ever process can be mapped to "a".
The next distinct string would be mapped to "b", etc.
Every time you process a string, you'd need to look it up to see if it has already been mapped.
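A rough sketch of that first-encounter scheme. The class name Shortener, the method names, and the exact code alphabet (a..z followed by A..Z, counted like spreadsheet column names) are assumptions for illustration; the question's own enumeration order may differ.

```ruby
ALPHABET = [*'a'..'z', *'A'..'Z'].freeze

# Turn 0, 1, 2, ... into "a", "b", ..., "Z", "aa", "ab", ...
# (bijective base-52 numbering, so no code is ever skipped).
def short_code(index)
  code = +""
  n = index + 1
  while n > 0
    n -= 1
    code.prepend(ALPHABET[n % ALPHABET.size])
    n /= ALPHABET.size
  end
  code
end

class Shortener
  def initialize
    @codes = {}   # original string => replacement
  end

  # Look the string up; if it is new, give it the next unused code.
  def code_for(string)
    @codes[string] ||= short_code(@codes.size)
  end
end

shortener = Shortener.new
puts shortener.code_for("some long identifier")   # a
puts shortener.code_for("another one")            # b
puts shortener.code_for("some long identifier")   # a again
```

With 52 symbols there are 52 one-character and 2,704 two-character codes, so fewer than 10,000 distinct inputs never need more than three characters. The replacement still depends on the order in which strings arrive, so the lookup table has to be kept around.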
"Optimally short" depends on the population of strings from which your samples are drawn. In the absence of systematic redundancy in the population, you will find that only a fraction of arbitrary strings can be compressed at all (e.g., consider trying to compress random bit strings).
If you can make assumptions about your data, such as "the strings are expected to be mainly composed of English words" then you can do something simple and effective based on letter frequency (e.g., for English, the relative frequency order is something like ETAOINSHRDLUGCY..., so you would want to use fewer bits to represent Es and more bits to represent uncommon letters like Q).
Cheers.

Symmetric Bijective String Algorithm?

I'm looking for an algorithm that can do a one-to-one mapping of a string onto another string.
I want an algorithm that given an alphabet I can perform a symmetric mapping function.
For example:
Let's consider that I have the alphabet "A","B","C","D","E","F". I want something like F("ABC") = "CEA" and F("CEA") = "ABC" for every N letter permutation.
Surely, an algorithm like this exists. If you know of an algorithm, please post the name of it and I can research it. If I haven't been clear enough in my request, please let me know.
Thanks in advance.
Edit 1:
I should clarify that I want enough entropy so that F("ABC") would equal "CEA" and F("CEA") = "ABC" but then I do NOT want F("ABD") to equal "CEF". Notice how two input letters stayed the same and the two corresponding output letters stayed the same?
So a Caesar Cipher/ROT13 or shuffling the array would not be sufficient. However, I don't need any "real" security. Just enough entropy for the output of the function to appear random. Weak encryption algorithms welcome.
Just create an array of objects that contain 2 fields -- a letter, and a random number. Sort the array by the random numbers. This creates a mapping where the i-th letter of the alphabet now maps to the i-th letter in the array.
If simple transposition or substitution isn't quite enough, it sounds like you want to advance to a polyalphabetic cipher. The Vigenère cipher is extremely easy to implement in code, but is still difficult to break without using a computer.
I suggest the following.
Perform a dense coding of the input to non-negative integers - with an alphabet size of n and string length of m you can code the string into integers between zero and n^m - 1. In your example (n = 6, m = 3) this would be the range [0,215]. Now perform a fixed involution on the encoded number and decode it again.
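One way this could look, as a sketch only. The involution here is built by shuffling all n^m codes with a fixed seed and swapping them in pairs; that choice, the seed value, and the names encode, decode, build_involution and scramble are all assumptions, not something the suggestion prescribes.

```ruby
ALPHABET = %w[A B C D E F].freeze

# Read the string as a base-n number, most significant letter first.
def encode(str)
  str.each_char.reduce(0) { |acc, ch| acc * ALPHABET.size + ALPHABET.index(ch) }
end

# Inverse of encode for a string of the given length.
def decode(num, length)
  letters = Array.new(length) do
    num, digit = num.divmod(ALPHABET.size)
    ALPHABET[digit]
  end
  letters.reverse.join   # digits were produced least significant first
end

# Fixed involution on 0...size: shuffle with a fixed seed, then swap
# the shuffled values in pairs. A leftover value maps to itself.
def build_involution(size, seed)
  table = Array.new(size) { |i| i }
  (0...size).to_a.shuffle(random: Random.new(seed)).each_slice(2) do |a, b|
    next if b.nil?
    table[a] = b
    table[b] = a
  end
  table
end

def scramble(str, seed: 42)
  table = build_involution(ALPHABET.size**str.length, seed)
  decode(table[encode(str)], str.length)
end

mapped = scramble("ABC")
puts mapped
puts scramble(mapped)   # ABC
```

The lookup table has n^m entries, so this only suits short strings over small alphabets, and nothing here guarantees that similar inputs get dissimilar outputs; it only makes that likely.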
Take RC4, settle for some password, and you're done. (Not that this would be very safe.)
Take the set of all permutations of your alphabet, shuffle it, and map the first half of the set onto the second half. Bad for large alphabets, of course. :)
Nah, thought that over, I forgot about character repetitions. Maybe divide the input into chunks without repeating chars and apply my suggestion to all of those chunks.
I would restate your problem thus, and give you a strategy for that restatement:
"A substitution cypher where a change in input leads to a larger change in output".
The blocking of characters is irrelevant-- in the end, it's just mappings between numbers. I'll speak of letters here, but you can extend it to any block of n characters.
One of the easiest routes for this is a rotating substitution based on input. Since you already looked at the Vigenere cipher, it should be easy to understand. Instead of making the key be static, have it be dependent on the previous letter. That is, rotate through substitutions a different amount per each input.
The variable rotation satisfies the condition of making each small change push out to a larger change. Note that the algorithm will only push changes in one direction such that changes towards the end have smaller effects. You could run the algorithm both ways (front-to-back, then back-to-front) so that every letter of cleartext changed has the possibility of changing the entire string.
The internal rotation strategy elides the need for keys, while of course losing of most of the cryptographic security. It makes sense in context, though, as you are aiming for entropy rather than security.
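A sketch of one possible reading of that rotating substitution, where the shift applied to each letter is taken from the previously produced output letter. The exact rule (previous output index plus one) and the names rotate_encrypt and rotate_decrypt are assumptions. Unlike the F(F(x)) = x mapping asked for, this version needs a separate decrypt step.

```ruby
ALPHABET = %w[A B C D E F].freeze

# Shift each letter by (index of the previous output letter + 1).
def rotate_encrypt(text)
  previous = 0
  text.each_char.map do |ch|
    previous = (ALPHABET.index(ch) + previous + 1) % ALPHABET.size
    ALPHABET[previous]
  end.join
end

# Undo the shift; the previous output letter is known from the ciphertext.
def rotate_decrypt(text)
  previous = 0
  text.each_char.map do |ch|
    current = ALPHABET.index(ch)
    plain = (current - previous - 1) % ALPHABET.size
    previous = current
    ALPHABET[plain]
  end.join
end

puts rotate_encrypt("ABC")   # BDA
puts rotate_encrypt("BBC")   # CEB: changing the first letter changes all three
puts rotate_encrypt("ABD")   # BDB: changing the last letter changes only the last
puts rotate_decrypt("BDA")   # ABC
```

The last two outputs show the one-directional effect described above, which is why a second pass from back to front helps.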
You can solve this problem with Format-preserving encryption.
One Java library can be found at https://github.com/EVGStudents/FPE.git. There you can define a regex and encrypt/decrypt string values matching this regex.

How to elegantly compute the anagram signature of a word in ruby?

Arising out of this question, I'm looking for an elegant (ruby) way to compute the word signature suggested in this answer.
The idea suggested is to sort the letters in the word, and also run length encode repeated letters. So, for example "mississippi" first becomes "iiiimppssss", and then could be further shortened by encoding as "4impp4s".
I'm relatively new to ruby and though I could hack something together, I'm sure this is a one liner for somebody with more experience of ruby. I'd be interested to see people's approaches and improve my ruby knowledge.
edit: to clarify, performance of computing the signature doesn't much matter for my application. I'm looking to compute the signature so I can store it with each word in a large database of words (450K words), then query for words which have the same signature (i.e. all anagrams of a given word, that are actual english words). Hence the focus on space. The 'elegant' part is just to satisfy my curiosity.
The fastest way to create a sorted list of the letters is this:
"mississippi".unpack("c*").sort.pack("c*")
It is quite a bit faster than split('') and join(). For comparison it is also best to pack the array back together into a String, so you don't have to compare arrays.
I'm not much of a Ruby person either, but as I noted on the other comment this seems to work for the algorithm described.
s = "mississippi"
s.split('').sort.join.gsub(/(.)\1{2,}/) { |s| s.length.to_s + s[0,1] }
Of course, you'll want to make sure the word is lowercase, doesn't contain numbers, etc.
As requested, I'll try to explain the code. Please forgive me if I don't get all of the Ruby or reg ex terminology correct, but here goes.
I think the split/sort/join part is pretty straightforward. The interesting part for me starts at the call to gsub. This will replace a substring that matches the regular expression with the return value from the block that follows it. The reg ex finds any character and creates a backreference. That's the "(.)" part. Then, we continue the matching process using the backreference "\1" that evaluates to whatever character was found by the first part of the match. We want that character to be found a minimum of two more times for a total minimum number of occurrences of three. This is done using the quantifier "{2,}".
If a match is found, the matching substring is then passed to the next block of code as an argument thanks to the "|s|" part. Finally, we use the string equivalent of the matching substring's length and append to it whatever character makes up that substring (they should all be the same) and return the concatenated value. The returned value replaces the original matching substring. The whole process continues until nothing is left to match since it's a global substitution on the original string.
I apologize if that's confusing. As is often the case, it's easier for me to visualize the solution than to explain it clearly.
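For what it's worth, the same one-liner can be wrapped in a method and used to group anagrams. This is only a sketch: the method name signature and the sample word list are made up, and chars is used in place of split('').

```ruby
# Sort the letters, then replace each run of three or more identical
# letters with "<count><letter>".
def signature(word)
  word.downcase.chars.sort.join.gsub(/(.)\1{2,}/) { |run| "#{run.length}#{run[0]}" }
end

puts signature("mississippi")   # 4impp4s

# Words with the same signature are anagrams of each other.
words = %w[listen silent enlist google mississippi]
groups = words.group_by { |w| signature(w) }
groups.each { |sig, ws| puts "#{sig}: #{ws.join(', ')}" }
```

In a database, the signature would be stored in an indexed column next to each word, so finding all anagrams of a word becomes a single equality lookup.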
I don't see an elegant solution. You could use the split message to get the characters into an array, but then once you've sorted the list I don't see a nice linear-time concatenate primitive to get back to a string. I'm surprised.
Incidentally, run-length encoding is almost certainly a waste of time. I'd have to see some very impressive measurements before I'd think it worth considering. If you avoid run-length encoding, you can anagrammatize any string, not just a string of letters. And if you know you have only letters and are trying to save space, you can pack them 5 bits to a letter.
---Irma Vep
EDIT: the other poster found join which I missed. Nice.
