Search array element in a file? - ruby

I have an array from a file.txt.
I want to loop through the list and check if each element exists in myfile.txt. If the element exists, go to the next element. If it does not exist I want to add it to the not-found array.
I tried using this code:
names = ["baba", "lily", "joe", "tsaki"]
names_not_found = []
for i in 0..names.length
while line = file.gets
puts if File.open('myfile.txt').lines.any?{|line| line.include?('names') << names_not_found}
end
end
puts names_not_found
end
I'm not too sure if I'm on the right track.

I am a bit confused by the other 2 answers, as I thought you wanted to find the elements of your names Array that are not found in myfile.txt. My answer will find those names. The other solutions find lines of myfile.txt that are not equal to any of your names elements. There certainly is some misunderstanding, so my apologies if this is not what you want.
You can read the whole file into a String once, and simply use .include? (which you already use) to see which names are mentioned in it. Note this simply checks for substrings, so if the file contains a "joey" it will "find" "joe" because it's part of it. So you might want to use regular expressions with word boundaries, but I suppose that's beyond the scope somewhat.
names = ["baba", "lily", "joe", "tsaki"]
contents = File.read('myfile.txt')
names_not_found = names.reject { |name| contents.include? name }
# => ["baba"]
# contents of myfile.txt:
# hello lily
# joe
# tsaki!!
# panda
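If the substring caveat matters, a word-boundary regex is one option. The sketch below is only an illustration: the helper name names_missing_from is made up, and a string literal stands in for File.read('myfile.txt') so the example is self-contained.

```ruby
# Hypothetical helper: returns the names that do not occur as whole words in text.
def names_missing_from(text, names)
  names.reject do |name|
    # \b is a word boundary; Regexp.escape guards against regex metacharacters in a name
    text.match?(/\b#{Regexp.escape(name)}\b/)
  end
end

# Stand-in for the file contents; note "joey" instead of "joe"
contents = "hello lily\njoey\ntsaki!!\npanda\n"
names = ["baba", "lily", "joe", "tsaki"]

missing = names_missing_from(contents, names)
puts missing.inspect  # prints ["baba", "joe"]
```

With plain include?, "joe" would have counted as found because it is part of "joey"; with the boundaries it is reported as missing.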

I would do as below:
csv_class_names = File.readlines("css_file")
js_file_string = File.read('js_file')
# "class" is a reserved word in Ruby, so the block parameter needs another name
names_not_found = csv_class_names.reject { |name| js_file_string.include?(name.chomp) }
puts names_not_found

Related

regex fetching first words from a dash separated line

I have an array that contains elements in this form and shape:
{john-mary-peter-car, house-dog, mouse-mice, etc}, so the words within each item are separated by dashes and the items are separated by commas (as you would expect in an array, of course). I am trying to map over the array with a regular expression to fetch only the first word of every element (like john, house, mouse), but all I'm getting is a list of 0s.
Here is the code (I first fetch the file and shove it into the array):
linesdb = Array.new
File.open('mylist.txt').each { |line| linesdb << line }
puts( linesdb.map{|a| a=~ /^(.+?)-/})
The regex has to be right, but I think something in the rest of the syntax is not OK. The regex says: fetch whatever there is from the beginning up to the dash, in a non-greedy fashion, so don't continue fetching beyond the first dash.
Thank you.
UPDATE
The comment by Mark Thomas had the brightest, most elegant code, but he did not post it as an answer, so I can't give the points to it. Then there was an alternative solution, which I honored.
Here is my own version, based on Mark Thomas's comment:
linesdb = Array.new
File.open('medicamentos-lista.txt').each { |line| linesdb << line.split('-').first }
puts linesdb
You could use scan like this:
File.open('mylist.txt').each { |line| linesdb << line.scan(/^(.+?)-/)[0][0] }
and then, to print the result in array form:
print linesdb
or
puts linesdb
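As background on why the original map printed zeros: =~ returns the index at which the match starts (or nil when there is no match), not the captured text, and a pattern anchored with ^ can only match at index 0 on these single-line strings. A small sketch, assuming an in-memory array with the sample items from the question in place of the file:

```ruby
# Sample lines standing in for the contents of mylist.txt
lines = ["john-mary-peter-car\n", "house-dog\n", "mouse-mice\n"]

# =~ yields the match position, so every anchored match gives 0
positions = lines.map { |line| line =~ /^(.+?)-/ }

# String#[] with a regex and a group number returns the captured text instead
first_words = lines.map { |line| line[/^(.+?)-/, 1] }

puts positions.inspect    # prints [0, 0, 0]
puts first_words.inspect  # prints ["john", "house", "mouse"]
```

Unlike scan(...)[0][0], the String#[] form returns nil for a line without a dash rather than raising an error.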
This solution is entirely based on Mark Thomas's input and ALL the credit goes to him, but he posted it as a comment, so I can't use it to give him points.
linesdb = Array.new
File.open('medicamentos-lista.txt').each { |line| linesdb << line.split('-').first }
puts linesdb
Also I am sure the solution of Hyphenbash would have worked 100%.
With regards, to whoever has voted down my question, I give him the finger in return.

Matching English words algorithm stops working when using Ruby bang methods

I am writing a matching algorithm that checks a user-entered word against a huge list of English words to see how many matches it can find. Everything works, except that I have two lines of code that are essentially meant to avoid picking the same letter twice, and they make the whole thing return just a single letter. Here is what I've done:
word_array = []
File.open("wordsEn.txt").each do |line|
word_array << line.chomp
end
puts "Please enter a string of characters with no spaces:"
user_string = gets.chomp.downcase
user_string_array = user_string.split("")
matching_words = []
word_array.each do |word|
one_array = word.split("")
tmp_user_string_array = user_string_array
letter_counter = 0
for i in 0...word.length
if tmp_user_string_array.include? one_array[i]
letter_counter += 1
string_index = tmp_user_string_array.index(one_array[i])
tmp_user_string_array.slice!(string_index)
end
end
if letter_counter == word.length
matching_words << word
end
end
puts matching_words
This part here is what breaks it:
string_index = tmp_user_string_array.index(one_array[i])
tmp_user_string_array.slice!(string_index)
Can anyone see an issue here? It all makes sense to me.
I see what's happening. You're eliminating letters for non-matching words, which prevents matching words from being found.
For example, take this word list:
ant
bear
cat
dog
emu
And this input to your program:
catdog
The first word you look for is ant, which causes the a and t to be sliced out of catdog, leaving cdog. Now the word cat can no longer be found.
The cure is to make sure that your tmp_user_string_array really is a temporary array. Currently it's a reference to the original user_string_array, which means that you're destructively modifying the user input. You should make a copy of it before you start slicing and dicing.
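A minimal sketch of that fix, assuming the check is pulled out into a helper (the name spellable? is made up for illustration) and a short in-memory word list replaces wordsEn.txt:

```ruby
# Hypothetical helper: can word be built from the letters in pool_source?
def spellable?(word, pool_source)
  pool = pool_source.dup            # work on a copy so the caller's array is untouched
  word.each_char.all? do |letter|
    index = pool.index(letter)
    index && pool.delete_at(index)  # use up the letter; a nil index means it is not available
  end
end

user_letters = "catdog".chars
words = %w[ant bear cat dog emu]

matching_words = words.select { |word| spellable?(word, user_letters) }
puts matching_words.inspect  # prints ["cat", "dog"]
puts user_letters.join       # prints catdog (the input is unchanged)
```

Because each word gets its own copy of the letters, checking ant no longer removes the a and t that cat needs.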
Once you've got that working, you might like to think about more efficient approaches that don't require duplicating and slicing arrays. Consider this: what if you were to sort each word of your lexicon as well as the input string before starting to look for a match? This would turn the word cat into act and the input catdog into acdgot. Do you see how you could traverse the sorted word and the sorted input in search of a match without the need to do any slicing?
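One possible reading of that hint, sketched under the assumption that a single pass over the sorted input is what is meant (the helper name sorted_match? is made up): walk the sorted input once and advance through the sorted word whenever the current letters agree.

```ruby
# Hypothetical helper: true when every letter of word (with multiplicity) occurs in input.
def sorted_match?(word, input)
  word_letters = word.chars.sort
  input_letters = input.chars.sort
  matched = 0                                  # how many letters of the word are matched so far
  input_letters.each do |letter|
    break if matched == word_letters.length    # the whole word is already covered
    matched += 1 if letter == word_letters[matched]
  end
  matched == word_letters.length
end

words = %w[ant bear cat dog emu]
found = words.select { |word| sorted_match?(word, "catdog") }
puts found.inspect  # prints ["cat", "dog"]
```

In a real program the input would be sorted once outside the loop, and the lexicon could be stored pre-sorted; the helper sorts both on every call only to stay self-contained.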

Extract names from File using Ruby and Grep

I have a file with the following data:
other data
user1=name1
user2=name2
user3=name3
other data
to extract the names I do the following
names = File.open('resource.cfg', 'r') do |f|
f.grep(/[a-z][a-z][0-9]/)
end
which returns the following array
user1=name1
user2=name2
user3=name3
but I really want only the name part
name1
name2
name3
Right now I'm doing this after the file step:
names = names.map do |name|
name[7..9]
end
Is there a better way to do this, ideally as part of the file step?
You could do it like this, using String#scan with a regex:
Code
File.read(FNAME).scan(/(?<==)[A-Za-z]+\d+$/)
Explanation
Let's start by constructing a file:
FNAME = "my_file"
lines =<<_
other data
user1=name1
user2=name2
user3=name3
other data
_
File.write(FNAME,lines)
We can confirm the file contents:
puts File.read(FNAME)
other data
user1=name1
user2=name2
user3=name3
other data
Now run the code:
File.read(FNAME).scan(/(?<==)[A-Za-z]+\d+$/)
#=> ["name1", "name2", "name3"]
A word about the regex I used.
(?<=...)
is called a "positive lookbehind". Whatever is inserted in place of the dots must immediately precede the match, but is not part of the match (and for that reason is sometimes referred to as a "zero-length" group). We want the match to follow an equals sign, so the "positive lookbehind" is as follows:
(?<==)
This is followed by one or more letters, then one or more digits, then an end-of-line, which comprise the pattern to be matched. You could of course change this if you have different requirements, such as names being lowercase or beginning with a capital letter, a specified number of digits, and so on.
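As one such variation, here is a sketch that swaps the lookbehind for a capture group. It assumes the keys and names consist only of word characters, and it uses a heredoc in place of the file so it runs on its own:

```ruby
# Stand-in for File.read(FNAME)
text = <<~DATA
  other data
  user1=name1
  user2=name2
  user3=name3
  other data
DATA

# With a capture group, scan returns an array of one-element arrays, hence flatten
names = text.scan(/^\w+=(\w+)$/).flatten
puts names.inspect  # prints ["name1", "name2", "name3"]
```

Anchoring with ^ as well as $ means a line only matches when it is exactly key=value, which is stricter than the lookbehind version.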
Is your code working as you have posted it?
names = File.open('resource.cfg', 'r') { |f| f.grep(/[a-z][a-z][0-9]/) }
names = names.map { |name| name[7..9] }
=> ["ame", "ame", "ame"]
You could make it into a neat little one-liner by writing it as such:
names = File.readlines('resource.cfg').grep(/=(\w*)/) { |x| x.split('=')[1].chomp }
You can do it all in a single step:
names = File.open('resource.cfg', 'r') do |f|
f.grep(/[a-z][a-z][0-9]/).map { |x| x.split('=')[1].chomp } # chomp drops the trailing newline
end

Ruby- find strings that contain letters in an array

I've googled everywhere and can't seem to find an example of what I'm looking for. I'm trying to learn Ruby and I'm writing a simple script. The user is prompted to enter letters, which are loaded into an array. The script then goes through a file containing a bunch of words and pulls out the words that contain what is in the array. My problem is that it only pulls words out if the letters appear in the same order as in the array. For example...
characterArray = Array.new;
puts "Enter characters that the password contains";
characters = gets.chomp;
puts "Searching words containing #{characters}...";
characterArray = characters.scan(/./);
searchCharacters=characterArray[0..characterArray.size].join;
File.open("dictionary.txt").each { |line|
if line.include?(searchCharacters)
puts line;
end
}
If I were to use this code and enter "dog",
the script would return
dog
doggie
but I need the output to include words even if the letters are not in the same order. Like...
dog
doggie
rodge
Sorry for the sloppy code. Like I said, I'm still learning. Thanks for your help.
PS. I've also tried this...
File.open("dictionary.txt").each { |line|
if line =~ /[characterArray[0..characterArray.size]]/
puts line;
end
}
but this returns all words that contain ANY of the letters the user entered.
First of all, you don't need to create characterArray with Array.new beforehand. Assigning the result of a method call to a new variable creates it.
In your code, characters will be, for example, "asd". characterArray will then be ["a", "s", "d"], and searchCharacters will be "asd" again. It seems you don't need this conversion.
characterArray[0..characterArray.size] is just equal to characterArray.
You can use the each_char iterator to iterate through the characters of a string. I suggest this:
puts "Enter characters that the password contains";
characters = gets.chomp;
File.open("dictionary.txt").each { |line|
unless characters.each_char.map { |c| line.include?(c) }.include? false
puts line;
end
}
I've checked that it works properly. In my code I build an array:
characters.each_char.map { |c| line.include?(c) }
The values of this array indicate whether each character was found in the line: true if it was, false if it was not. The length of this array equals the number of characters in characters. We consider a line good if the array contains no false values.
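The same test can also be written with Enumerable#all?, which stops at the first missing character instead of building the intermediate array. A short sketch, with a made-up helper name and an in-memory word list in place of dictionary.txt:

```ruby
# Hypothetical helper: true when every character of letters occurs somewhere in word
def contains_all_letters?(word, letters)
  letters.each_char.all? { |c| word.include?(c) }
end

words = ["dog", "doggie", "rodge", "cat"]
matches = words.select { |word| contains_all_letters?(word, "dog") }
puts matches.inspect  # prints ["dog", "doggie", "rodge"]
```

Like the map version, this ignores how often a letter was entered: typing "oo" matches any word with at least one o.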

Simplest way to validate an input against a file of words?

What is the best way to validate a gets input against a very long word list (a list of all the English words available)?
I am currently playing with readlines to manipulate the text, but before there's any manipulation, I would like to first validate the entry against the list.
The simplest way, but by no means the fastest, is to simply search against the word list each time. If the word list is in an array:
if word_list.index word
#manipulate word
end
If, however, you had the word list as a separate file (with each word on a separate line), then we'll use File#foreach to find it:
if File.foreach("word.list") {|x| break x if x.chomp == word}
#manipulate word
end
Note that foreach does not strip off the trailing newline character(s), so we get rid of them with String#chomp.
Here's a simple example using a Set, though Mark Johnson is right,
a bloom filter would be more efficient.
require 'set'
WORD_RE = /\w+/
# Read in the default dictionary (from /usr/share/dict/words),
# and put all the words into a set
WORDS = Set.new(File.read('/usr/share/dict/words').scan(WORD_RE))
# read the input line by line
STDIN.each_line do |line|
# find all the words in the line that aren't contained in our dictionary
unrecognized = line.scan(WORD_RE).find_all { |term| not WORDS.include? term }
# if none were found, the line is valid
if unrecognized.empty?
puts "line is valid"
else # otherwise, the line contains some words not in our dictionary
puts "line is invalid, could not recognize #{unrecognized.inspect}"
end
end
Are you reading the list from a file?
Can't you keep it all in memory?
Maybe a finger tree would help you.
If not, there's not much more to do than read a chunk of data from the file and grep through it.
Read the word list into memory, and for each word, make an entry into a hash table:
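def init_word_tester
# @words is an instance variable holding the lookup table
@words = {}
File.foreach("word.list") {|word|
@words[word.chomp] = 1
}
end
Now you can just check every word against your hash:
def test_word word
# returns 1 if the word is in the list, nil otherwise
return @words[word]
end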

Resources