I am trying to write a function that returns True if a given string contains a run of exactly 6 consecutive identical characters, and False otherwise. If the run is longer or shorter than 6, it should return False.
I am not allowed to use lists, sets, or imported packages. I am restricted to while loops, for loops, and basic mathematical operations.
Two example runs are shown below:
Enter a string: 367777776
True
Enter a string: 3677777777776
False
Note that although I entered numbers, the function argument is actually a string, for example: consecutive('3777776')
I tried to convert the string into an ASCII table and then try and filter out the numbers there. However, I
def consecutive(x):
    storage= ' '
    acc=0
    count=0
    for s in x:
        storage+= str(ord(s)) + ' '
        acc+=ord(s)
        if acc == acc:
            count+=1
    for s in x-1:
        return count
My intention is to compare the previous character's ASCII code to the current character's ASCII code in the string. If the ASCII codes don't match, I will add an accumulator for it. The accumulator will hold the number of duplicates. From there, I will use an if-else statement to see whether it is greater or less than 6. However, I have a hard time translating my thoughts into Python code.
Can anyone assist me?
That's a pretty good start!
A few comments:
Variables storage and acc play the same role, and are a little more complicated than they have to be. All you want to know when you arrive at character s is whether or not s is identical to the previous character. So, you only need to store the previously seen character.
Condition acc == acc is always going to be True. I think you meant acc == s?
When you encounter an identical character, you correctly increase the count with count += 1. However, when we change characters, you should reset the count.
With these comments in mind, I fixed your code, then blanked out a few parts for you to fill. I've also renamed storage and acc to previous_char which I think is more explicit.
def has_6_consecutive(x):
    previous_char = None
    count = 0
    for s in x:
        if s == previous_char:
            ???
        elif count == 6:
            ???
        else:
            ???
        previous_char = ???
    ???
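For reference, here is one way the blanks in the template could be filled in. This is only a sketch of one possible reading, assuming a run counts only when it is exactly 6 characters long; the final return covers a run that ends at the end of the string.

```python
def has_6_consecutive(x):
    previous_char = None
    count = 0
    for s in x:
        if s == previous_char:
            count += 1          # still inside the same run
        elif count == 6:
            return True         # the run that just ended had exactly 6 characters
        else:
            count = 1           # a new run starts with this character
        previous_char = s
    return count == 6           # the last run is only checked here


print(has_6_consecutive('367777776'))      # True
print(has_6_consecutive('3677777777776'))  # False
```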
You could use recursion. Loop over all the characters and for each one check to see if the next 6 are identical. If so, return true. If you get to the end of the array (or even within 6 characters of the end), return false.
For more info on recursion, check this out: https://www.programiz.com/python-programming/recursion
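A minimal recursive sketch, under the assumption that "exactly 6" means a maximal run of length 6. The function name and the start parameter are illustrative and not from the question. It measures one run per call and recurses on the rest of the string, using only indices and a while loop (no lists).

```python
def has_run_of_six(s, start=0):
    # Base case: nothing left to scan.
    if start >= len(s):
        return False
    # Measure the run of identical characters beginning at `start`.
    end = start
    while end < len(s) and s[end] == s[start]:
        end += 1
    if end - start == 6:
        return True
    # Recurse on the part of the string after this run.
    return has_run_of_six(s, end)
```

Each call consumes a whole run, so the recursion depth equals the number of runs; very long inputs with many short runs could hit Python's recursion limit.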
would something like this be allowed?
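def consecF(n):
    consec = 0
    prev = None
    for i in n:
        if i == prev:
            consec += 1
        else:
            # The previous run just ended; succeed only if it was exactly 6 long.
            if consec == 6:
                return True
            consec = 1
        prev = i
    # The final run is not followed by a different character, so check it here.
    return consec == 6
n = "12111123333221"
print(consecF(n))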
You can try a two pointer approach, where the left pointer is fixed at the first instance of some digit and the right one is shifted as long as the digit is seen.
def consecutive(x):
    left = 0
    while left != len(x):
        right = left
        while right < len(x) and x[right] == x[left]:
            right += 1
        length = (right - 1) - left + 1  # from left to right - 1 inclusive, x[left] repeated
        if length == 6:  # found desired length
            return True
        left = right
    return False  # no segment found
tests = [
    '3677777777776',
    '367777776'
]
for test in tests:
    print(f"{test}: {consecutive(test)}")
Output
3677777777776: False
367777776: True
You should store the current sequence of repeated chars.
def consecutive(x):
    sequencechar = ' '
    repetitions = 0
    for ch in x:
        if ch != sequencechar:
            if repetitions == 6:
                break
            sequencechar = ch
            repetitions = 1
        else:
            repetitions += 1
    return repetitions == 6
If I could, I would not have given the entire solution, but this is still a simple problem. However, a few points need care.
As you can see, the character of the current sequence is stored. When a sequence ends and a new one starts, the loop breaks out if the sequence that just ended had the required length.
Also, after the for loop ends normally, the last sequence is checked (which was not done in the loop).
This is a homework assignment that I've been working on to compute if a credit card number is valid. It has many steps and uses 2 other helper functions.
The first helper function makes a list consisting of each digit in n:
def intToList(n):
    strr = [num for num in str(n)]
    theList = list(map(int, strr))
    return theList
The second helper function sums the digits of a number:
def addDigits(n):
    sums = 0
    while n:
        if n > 0:
            sums += n % 10
            n //= 10
        else:
            return
    return sums
For example:
>>> addDigits(332)  # 3 + 3 + 2 = 8
8
So the function I am working on is supposed to validate a 16-digit credit card number. It must perform the following steps in the order given:
Verifies that it contains only digits. #Done.
Verifies that it is 16 digits long. #Done.
If n is a string, it converts it to an integer.
Creates a list using the function intToList(n).
Multiplies the values at the odd indices of the list made by intToList(n) by 2; any products that are two-digit numbers are replaced by the sum of their digits using the function addDigits(n).
Computes the sum of all the single digits in the list made by intToList(n). If the sum is equal to 0 modulo 10, the original value, n, is a valid credit card number.
As of right now I have this:
def checkCreditCard(n):
    #Suppose to convert n to int.
    n = int(n)
    #Helper function 1 to make a list.
    myList = intToList(n)
    #For loop to apply the math to each odd indices.*
    for ele in myList:
        if ele % 2 == 1:
            ele *= 2
        if ele >= 10:
            single = addDigits(?) #not sure what to put I've tried everything
    if sum(myList) % 10 == 0:
        return True
    return False
Here is my issue: I am unsure where to go from here. I think the code above is correct so far, but I don't know how to use my addDigits function to turn the two-digit products into single digits, and then compute the sum of all the single digits in the list.
Any help would be greatly appreciated. Let me know if I can clear anything up.
added what I've worked on.
Simple trick: The digit sum of any number from 10 to 18 (the possible two-digit results of doubling a single digit or adding two single digits) can be computed simply by subtracting 9. So if you have a value that may be single-digit or double-digit, you can reduce it to a single digit with:
singledigit = maybetwodigit - 9 * (maybetwodigit >= 10)
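A quick way to convince yourself of the subtract-9 trick is to compare it with an explicit digit sum over the whole range 0 to 18. The helper names below are illustrative only; the trick relies on Python treating True as 1 in arithmetic.

```python
def digit_sum(n):
    # Plain digit sum, for comparison with the shortcut.
    total = 0
    while n > 0:
        total += n % 10
        n //= 10
    return total


def to_single_digit(value):
    # (value >= 10) is True (1) or False (0), so 9 is subtracted only
    # when the value has two digits.
    return value - 9 * (value >= 10)


for value in range(19):
    assert to_single_digit(value) == digit_sum(value)
```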
For the record, your code as written is not correct:
def checkCreditCard(n):
    #My checks for length and digits.
    if len(str(n)) == 16 and str(n).isdigit():
        return True
    else:
        return False
    # nothing at this line or below will ever execute, because both your if
    # and else conditions return
Also, your (currently unused) loop will never work, because you don't assign what you've calculated. You probably want something like this:
for i, ele in enumerate(myList):
    if i % 2 == 1:
        ele *= 2
        myList[i] = ele - 9 * (ele >= 10)  # Seamlessly sum digits of two digit nums
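Putting the pieces together, the whole check might look like the sketch below. It follows the steps as listed in the question (doubling the values at odd list indices), which is an assumption about the assignment and not necessarily the standard card-validation rule. The name check_credit_card is illustrative. The sketch also works on the string form rather than calling int(n), because converting to an integer would drop any leading zeros.

```python
def check_credit_card(n):
    s = str(n)
    # Steps 1 and 2: only digits, exactly 16 of them.
    if len(s) != 16 or not s.isdigit():
        return False
    digits = [int(ch) for ch in s]
    # Double the values at odd indices and reduce two-digit products
    # to their digit sum by subtracting 9.
    for i, d in enumerate(digits):
        if i % 2 == 1:
            d *= 2
            digits[i] = d - 9 * (d >= 10)
    # Valid when the total is 0 modulo 10.
    return sum(digits) % 10 == 0
```

For instance, '1900000000000000' doubles the 9 at index 1 to 18, reduces it to 9, and the total 1 + 9 = 10 passes the check.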
I have a forvalues loop:
forvalues x = 1(1)50 {
    /* Code goes here */
}
Instead of 50, ideally, I would like that value to be computed as follows. I have a variable called name. Let length = length(name). I would like the largest value of length to take the place of the 50. I could not figure out how to write a forvalues loop in which the end point is not stated directly as a number.
I am thinking that I could deduce the maximum length of the variable as follows:
gen id = 1
gen length = length(name)
by id, sort: egen maxlength = max(length)
From there though I do not know how to store this value into the for loop.
Alternatively, would this be better coded by a while loop?
Something like:
gen x = 1
while (x <= maxlength) {
    /* Same Code Here */
    replace x = x + 1
}
Based on the documentation I've read, it is possible to use macros but with the caveat that changing the end of the range within the forvalues loop has no effect on the number of times the loop will occur. For instance, if length(name) is 50 when the forvalues loop starts, and you change the length of name within the loop, it will still only loop 50 times.
Technically, you'd be better off using a while loop since forvalues was intended to be used when the end of the range is a literal value. You can use a forvalues loop, but you should use a while loop.
Here's my source to back this up:
http://www.stata.com/manuals13/pforvalues.pdf
Specifically:
Technical note
It is not legal syntax to type
. scalar x = 3
. forvalues i = 1(1)`x' {
2. local x = `x' + 1
3. display `i'
4. }
forvalues requires literal numbers. Using macros, as shown in the following technical note, is allowed.
And:
Technical note
The values of the loop bounds are determined once and for all the first time the loop is executed. Changing the loop bounds will have no effect. For instance,
local n 3
forvalues i = 1(1)`n' {
    local n = `n' + 1
    display `i'
}
will not create an infinite loop. With `n' originally equal to 3, the loop will be performed three times.
Output:
1
2
3
Here is the trick with Stata which I think may work for you. I am using the data auto from Stata datasets.
sysuse auto
Suppose the variable here is price, and you want the length of the variable price.
sum price
gen length=r(N)
To see what r(N) contains, type return list after running sum price.
In your loop it goes as follows (updated as per @Nick):
forvalues x = 1/`r(N)' {
    /* Code goes here */
}
OR:
local length=r(N)
forvalue i=1/`length' {
    dis "`i'"
}
Note: It is not clear why you want a for loop, so my answer is restricted to what you asked for.
@Metrics' first code won't quite work. Here is a better way, cutting out what I call the middle macro.
Start with something more like
. su price, meanonly
. forval j = 1/`r(N)' {
An equivalent approach to the one proposed by @Nick and @Metrics is the following:
sysuse auto, clear
count if !missing(price)
forvalues x = 1 / `r(N)' {
    /* Code goes here */
}
Not a homework question, but a possible interview question...
Given an array of integers, write an algorithm that will check if the sum of any two is zero.
What is the Big O of this solution?
Looking for non brute force methods
Use a lookup table: Scan through the array, inserting all positive values into the table. If you encounter a negative value of the same magnitude (which you can easily look up in the table), their sum will be zero. The lookup table can be a hashtable to conserve memory.
This solution should be O(N).
Pseudo code:
var table = new HashSet<int>();
var array = // your int array
foreach (int n in array)
{
    // Check before adding, so a single 0 does not match itself.
    if (table.Contains(-n))
    {
        // You found it.
    }
    table.Add(n);
}
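The same lookup-table idea can be sketched in Python with a set. The function name is illustrative. Checking for the negation before inserting the current value means a lone 0 is not reported as a pair, while two zeros are.

```python
def has_zero_sum_pair(values):
    seen = set()
    for v in values:
        # A pair exists if the negation of v was seen earlier.
        if -v in seen:
            return True
        seen.add(v)
    return False
```

Set lookups and insertions are O(1) on average, which gives the expected O(N) running time.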
The hashtable solution others have mentioned is usually O(n), but it can also degenerate to O(n^2) in theory.
Here's a Theta(n log n) solution that never degenerates:
Sort the array (optimal quicksort, heap sort, merge sort are all Theta(n log n))
for i = 1, array.len - 1
    binary search for -array[i] in i+1, array.len
If your binary search ever returns true, then you can stop the algorithm and you have a solution.
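A sketch of the sort-plus-binary-search idea in Python, using the standard bisect module. The function name is illustrative. Searching only to the right of position i is enough, because any partner to the left would already have been tried when the loop visited it.

```python
import bisect


def has_zero_pair_sorted(values):
    a = sorted(values)                            # O(n log n)
    for i in range(len(a) - 1):
        target = -a[i]
        # Binary search for target in a[i+1:], O(log n) per element.
        j = bisect.bisect_left(a, target, i + 1)
        if j < len(a) and a[j] == target:
            return True
    return False
```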
An O(n log n) solution (i.e., the sort) would be to sort all the data values then run a pointer from lowest to highest at the same time you run a pointer from highest to lowest:
def findmatch(array n):
    lo = first_index_of(n)
    hi = last_index_of(n)
    while true:
        if lo >= hi:                    # Catch where pointers have met.
            return false
        if n[lo] = -n[hi]:              # Catch the match.
            return true
        if sign(n[lo]) = sign(n[hi]):   # Catch where pointers are now same sign.
            return false
        if -n[lo] > n[hi]:              # Move relevant pointer.
            lo = lo + 1
        else:
            hi = hi - 1
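The two-pointer scan can be sketched in runnable Python as follows. This version is a slight simplification and not a line-by-line translation of the pseudocode: it compares the sum of the two pointed-at values with zero and skips the early exit on equal signs. The function name is illustrative.

```python
def has_zero_pair_two_pointers(values):
    a = sorted(values)              # the O(n log n) part
    lo, hi = 0, len(a) - 1
    while lo < hi:
        total = a[lo] + a[hi]
        if total == 0:
            return True
        if total < 0:
            lo += 1                 # negative side dominates, move it inward
        else:
            hi -= 1                 # positive side dominates, move it inward
    return False
```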
An O(n) time complexity solution is to maintain an array of all values met:
def findmatch(array n):
    maxval = maximum_value_in(n)        # This is O(n).
    array b = new array(0..maxval)      # This is O(1).
    zero_all(b)                         # This is O(maxval).
    for i in index(n):                  # This is O(n).
        if n[i] = 0:
            if b[0] = 1:
                return true
            b[0] = 1
            nextfor
        if n[i] < 0:
            if -n[i] <= maxval:
                if b[-n[i]] = 1:
                    return true
                b[-n[i]] = -1
            nextfor
        if b[n[i]] = -1:
            return true
        b[n[i]] = 1
    return false                        # No pair found.
This works by simply maintaining a sign for a given magnitude, every possible magnitude between 0 and the maximum value.
So, if at any point we find -12, we set b[12] to -1. Then later, if we find 12, we know we have a pair. Same for finding the positive first except we set the sign to 1. If we find two -12's in a row, that still sets b[12] to -1, waiting for a 12 to offset it.
The only special cases in this code are:
0 is treated specially since we need to detect it despite its somewhat strange properties in this algorithm (I treat it specially so as to not complicate the positive and negative cases).
low negative values whose magnitude is higher than the highest positive value can be safely ignored since no match is possible.
As with most tricky "minimise-time-complexity" algorithms, this one has a trade-off in that it may have a higher space complexity (such as when there's only one element in the array that happens to be positive two billion).
In that case, you would probably revert to the sorting O(n log n) solution but, if you know the limits up front (say if you're restricting the integers to the range [-100,100]), this can be a powerful optimisation.
In retrospect, perhaps a cleaner-looking solution may have been:
def findmatch(array num):
    # Array empty means no match possible.
    if num.size = 0:
        return false
    # Find biggest value, no match possible if all values are negative.
    max_positive = num[0]
    for i = 1 to num.size - 1:
        if num[i] > max_positive:
            max_positive = num[i]
    if max_positive < 0:
        return false
    # Create and init array of positives.
    array found = new array[max_positive+1]
    for i = 1 to found.size - 1:
        found[i] = false
    zero_found = false
    # Check every value.
    for i = 0 to num.size - 1:
        # More than one zero means match is found.
        if num[i] = 0:
            if zero_found:
                return true
            zero_found = true
        # Otherwise store fact that you found positive.
        if num[i] > 0:
            found[num[i]] = true
    # Check every value again.
    for i = 0 to num.size - 1:
        # If negative and within positive range and positive was found, it's a match.
        if num[i] < 0 and -num[i] <= max_positive:
            if found[-num[i]]:
                return true
    # No matches found, return false.
    return false
This makes one full pass and a partial pass (or full on no match) whereas the original made the partial pass only but I think it's easier to read and only needs one bit per number (positive found or not found) rather than two (none, positive or negative found). In any case, it's still very much O(n) time complexity.
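As a rough Python rendering of this two-pass version, the sketch below uses a bytearray as the "positive found" table (one byte per value, where the pseudocode only needs one bit). The function name is illustrative. Memory grows with the largest positive value, which is the trade-off discussed above.

```python
def find_match(nums):
    # Empty input means no match is possible.
    if not nums:
        return False
    max_positive = max(nums)
    # All values negative: no pair can sum to zero.
    if max_positive < 0:
        return False
    found = bytearray(max_positive + 1)   # found[v] == 1 once v has been seen
    zero_seen = False
    # First pass: record positives and detect a second zero.
    for v in nums:
        if v == 0:
            if zero_seen:
                return True
            zero_seen = True
        elif v > 0:
            found[v] = 1
    # Second pass: look for a negative whose positive partner was recorded.
    for v in nums:
        if v < 0 and -v <= max_positive and found[-v]:
            return True
    return False
```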
I think IVlad's answer is probably what you're after, but here's a slightly more off the wall approach.
If the integers are likely to be small and memory is not a constraint, then you can use a BitArray collection. This is a .NET class in System.Collections, though Microsoft's C++ has a bitset equivalent.
The BitArray class allocates a lump of memory, and fills it with zeroes. You can then 'get' and 'set' bits at a designated index, so you could call myBitArray.Set(18, true), which would set the bit at index 18 in the memory block (which then reads something like 00000000, 00000000, 00100000). The operation to set a bit is an O(1) operation.
So, assuming a 32 bit integer scope, and 1Gb of spare memory, you could do the following approach:
BitArray myPositives = new BitArray(int.MaxValue);
BitArray myNegatives = new BitArray(int.MaxValue);
bool pairIsFound = false;
foreach (int testValue in arrayOfIntegers)
{
    if (testValue < 0)
    {
        // -ve number - have we seen the +ve yet?
        if (myPositives.Get(-testValue))
        {
            pairIsFound = true;
            break;
        }
        // Not seen the +ve, so log that we've seen the -ve.
        myNegatives.Set(-testValue, true);
    }
    else
    {
        // +ve number (inc. zero). Have we seen the -ve yet?
        if (myNegatives.Get(testValue))
        {
            pairIsFound = true;
            break;
        }
        // Not seen the -ve, so log that we've seen the +ve.
        myPositives.Set(testValue, true);
        if (testValue == 0)
        {
            myNegatives.Set(0, true);
        }
    }
}
// query setting of pairIsFound to see if a pair totals to zero.
Now I'm no statistician, but I think this is an O(n) algorithm. There is no sorting required, and the longest duration scenario is when no pairs exist and the whole integer array is iterated through.
Well - it's different, but I think it's the fastest solution posted so far.
Comments?
Maybe stick each number in a hash table, and if you see a negative one check for a collision? O(n). Are you sure the question isn't to find if ANY sum of elements in the array is equal to 0?
Given a sorted array you can find number pairs (-n and +n) by using two pointers:
the first pointer moves forward (over the negative numbers),
the second pointer moves backwards (over the positive numbers),
depending on the values the pointers point at, you move one of the pointers (the one whose value has the larger absolute value),
you stop as soon as the pointers meet or one of them passes 0,
equal magnitudes (one negative and one positive, or both zero) are a match.
Now, this is O(n), but sorting (if necessary) is O(n*log(n)).
EDIT: example code (C#)
// sorted array
var numbers = new[]
{
    -5, -3, -1, 0, 0, 0, 1, 2, 4, 5, 7, 10 , 12
};
var npointer = 0; // pointer to negative numbers
var ppointer = numbers.Length - 1; // pointer to positive numbers
while( npointer < ppointer )
{
    var nnumber = numbers[npointer];
    var pnumber = numbers[ppointer];
    // each pointer scans only its number range (neg or pos)
    if( nnumber > 0 || pnumber < 0 )
    {
        break;
    }
    // Do we have a match?
    if( nnumber + pnumber == 0 )
    {
        Debug.WriteLine( nnumber + " + " + pnumber );
    }
    // Adjust one pointer
    if( -nnumber > pnumber )
    {
        npointer++;
    }
    else
    {
        ppointer--;
    }
}
Interesting: we have 0, 0, 0 in the array. The algorithm will output two pairs. But in fact there are three pairs ... we need more specification what exactly should be output.
Here's a nice mathematical way to do it: Keep in mind all prime numbers (i.e. construct an array prime[0 .. max(array)], so that prime[i] stands for the i-th prime). Below, n is the length of the input array.
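counter = 1
zeros = 0
for i in inputarray:
    if (i > 0):
        counter = counter * prime[i]
    if (i == 0):
        zeros = zeros + 1
# A single 0 has no partner; two zeros sum to zero.
if (zeros >= 2):
    return "found"
for i in inputarray:
    if (i < 0):
        if (counter % prime[-i] == 0):
            return "found"
return "not found"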
However, the problem when it comes to implementation is that storing and multiplying prime numbers is treated as O(1) in a traditional model, but if the array (i.e. n) is large enough, this model is inappropriate.
However, it is a theoretic algorithm that does the job.
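For the curious, the prime-product idea can be tried out in Python, whose integers have arbitrary precision. This is a sketch under a few assumptions of its own: the helper first_primes uses simple trial division, zeros are counted separately (a single 0 has no partner), and negatives whose magnitude exceeds the largest positive are skipped. All names are illustrative.

```python
def first_primes(count):
    # Return the first `count` primes using trial division.
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


def has_zero_pair_primes(values):
    if values.count(0) >= 2:
        return True
    positives = [v for v in values if v > 0]
    if not positives:
        return False
    prime = first_primes(max(positives) + 1)   # prime[v] encodes the value v
    product = 1
    for v in positives:
        product *= prime[v]
    # A negative value has a partner exactly when its prime divides the product.
    return any(
        v < 0 and -v < len(prime) and product % prime[-v] == 0
        for v in values
    )
```

The product grows very quickly, so this is a curiosity and not a practical alternative to the hash-based or sorting approaches.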
Here's a slight variation on IVlad's solution which I think is conceptually simpler, and also n log n but with fewer comparisons. The general idea is to start on both ends of the sorted array, and march the indices towards each other. At each step, only move the index whose array value is further from 0 -- in only Theta(n) comparisons, you'll know the answer.
sort the array (n log n)
loop, starting with i=0, j=n-1
    if a[i] == -a[j], then stop:
        if a[i] != 0 or i != j, report success, else failure
    if i >= j, then stop: report failure
    if abs(a[i]) > abs(a[j]) then i++ else j--
(Yeah, probably a bunch of corner cases in here I didn't think about. You can thank that pint of homebrew for that.)
e.g.,
[ -4, -3, -1,  0,  1,  2 ]    notes:
   ^i                  ^j     a[i]!=-a[j], i<j, abs(a[i])>abs(a[j])
       ^i              ^j     a[i]!=-a[j], i<j, abs(a[i])>abs(a[j])
           ^i          ^j     a[i]!=-a[j], i<j, abs(a[i])<abs(a[j])
           ^i      ^j         a[i]==-a[j] -> done
The sum of two integers can only be zero if one is the negative of the other, like 7 and -7, or 2 and -2.
I have implemented a sorting algorithm for a custom string that represents either time or distance data for track & field events. Below is the format
'10:03.00 - Either ten minutes and three seconds or 10 feet, three inches
The result of the sort is that for field events, the longest throw or jump would be the first element while for running events, the fastest time would be first. Below is the code I am currently using for field events. I didn't post the running_event_sort since it is the same logic with the greater than/less than swapped. While it works, it just seems overly complex and needs to be refactored. I am open to suggestions. Any help would be great.
event_participants.sort!{ |a, b| Participant.field_event_sort(a, b) }
class Participant
  def self.field_event_sort(a, b)
    a_parts = a.time_distance.scan(/'([\d]*):([\d]*).([\d]*)/)
    b_parts = b.time_distance.scan(/'([\d]*):([\d]*).([\d]*)/)
    if(a_parts.empty? || b_parts.empty?)
      0
    elsif a_parts[0][0] == b_parts[0][0]
      if a_parts[0][1] == b_parts[0][1]
        if a_parts[0][2] > b_parts[0][2]
          -1
        elsif a_parts[0][2] < b_parts[0][2]
          1
        else
          0
        end
      elsif a_parts[0][1] > b_parts[0][1]
        -1
      else
        1
      end
    elsif a_parts[0][0] > b_parts[0][0]
      -1
    else
      1
    end
  end
end
This is a situation where #sort_by could simplify your code enormously:
event_participants = event_participants.sort_by do |s|
  # Match against the participant's time_distance string.
  if s.time_distance =~ /'(\d+):(\d+)\.(\d+)/
    [ $1, $2, $3 ].map { |digits| digits.to_i }
  else
    []
  end
end.reverse
Here, I parse the relevant times into an array of integers, and use those as a sorting key for the data. Array comparisons are done entry by entry, with the first being the most significant, so this works well.
One thing you don't do is convert the digits to integers, which you most likely want to do. Otherwise, you'll have issues with "100" < "2" #=> true. This is why I added the #map step.
Also, in your regex, the square brackets around \d are unnecessary, though you do want to escape the period so it doesn't match all characters.
One way the code I gave doesn't match the code you gave is in the situation where a line doesn't contain any distances. Your code will compare such lines as equal to the surrounding lines, which may get you into trouble if the sorting algorithm assumes equality is transitive (that is, a == b and b == c implies a == c). That is not the case for your code: for example a = "'10:00.1", b = "frog", c = "'9:99:9". My code instead gives such lines an empty key, so they all end up at the end of the reversed result.
#sort_by sorts in ascending order, so the call to #reverse will change it into descending order. #sort_by also has the advantage of only parsing out the comparison values once, whereas your algorithm will have to parse each line for every comparison.
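The same sort-key idea, sketched in Python for comparison (the thread itself is about Ruby; the sample marks and the names sort_key and field_order are made up for illustration). Each string is parsed once into a tuple of integers, and tuples compare element by element just like Ruby arrays.

```python
import re

_PATTERN = re.compile(r"'(\d+):(\d+)\.(\d+)")


def sort_key(mark):
    # Parse "'10:03.00" into (10, 3, 0); unparseable marks get an empty key.
    m = _PATTERN.search(mark)
    if m is None:
        return ()
    return tuple(int(group) for group in m.groups())


marks = ["'10:03.00", "'9:11.50", "frog", "'10:04.00"]
# Longest throw first; the empty key sorts last in descending order.
field_order = sorted(marks, key=sort_key, reverse=True)
print(field_order)
```

For running events the same key can be used without reverse=True, though unparseable marks would then sort first.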
Instead of implementing the sort like this, maybe you should have TrackTime and FieldDistance models. They don't necessarily need to be persisted - the Participant model could create them from time_distance when it is loaded.
You're probably going to want to be able to get the difference between two values and validate values, as well as sort values, in the future. The model would make it easy to add these features. It would also make unit testing a lot easier.
I'd also separate time and distance into two separate fields. Having dual purpose columns in the database only causes pain down the line in my experience.
I don't know ruby but here's some c-like pseudo code that refactors this a bit.
/// In c, I would probably shorten this with the ? operator.
int compareIntegers(a, b) {
    int result = 0;
    if (a < b) {
        result = -1;
    } else if (a > b) {
        result = 1;
    }
    return result;
}
int compareValues(a, b) {
    int result = 0;
    if (!/* check for empty*/) {
        int majorA = /* part before first colon */
        int majorB = /* part before first colon */
        int minorA = /* part after first colon */
        int minorB = /* part after first colon */
        /// In c, I would probably shorten this with the ? operator.
        result = compareIntegers(majorA, majorB);
        if (result == 0) {
            result = compareIntegers(minorA, minorB);
        }
    }
    return result;
}
Your routine looks fine but you could just remove the ''', ':' and '.' and treat the result as a numeric string. In other words the 10' 5" would become 1005 and 10' 4" would be 1004. 1005 is clearly more than 1004.
Since the higher-order elements are on the left, it will sort naturally. This also works with time for the same reasons.
I agree that converting to integers will make it simpler. Also note that for integers
if a > b
  1
elsif a < b
  -1
else
  0
end
can be simplified to a<=>b. To get the reverse use -(a <=> b).
In this scenario:
Since you know you are working with feet, inches, and (whatever your third unit of measure is), why not just create a total sum of the two values you are comparing?
So after these two lines:
a_parts = a.time_distance.scan(/'([\d]*):([\d]*).([\d]*)/)
b_parts = b.time_distance.scan(/'([\d]*):([\d]*).([\d]*)/)
Generate the total distance for a_parts and b_parts:
totalDistanceA = a_parts[0][0].to_i * 12 + a_parts[0][1].to_i + a_parts[0][2].to_i * (whatever your third unit of measure factor against the size of an inch)
totalDistanceB = b_parts[0][0].to_i * 12 + b_parts[0][1].to_i + b_parts[0][2].to_i * (whatever your third unit of measure factor against the size of an inch)
Then return the comparison of these two values:
totalDistanceA <=> totalDistanceB
Note that you should keep the validation you are already making that checks if a_parts and b_parts are empty or not:
a_parts.empty? || b_parts.empty?
For doing the sorting by time scenario, do the exact same thing except with different factors (for example, 60 seconds to a min).
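To make the "total value" idea concrete, here is a small Python sketch (the thread is Ruby; Python is used only for illustration). It assumes, without confirmation from the question, that the third field is always a two-digit count of hundredths, and it returns the total in hundredths of the smaller unit so everything stays an integer. The name total_units and the factor parameter are hypothetical.

```python
import re

_PATTERN = re.compile(r"'(\d+):(\d+)\.(\d+)")


def total_units(mark, major_factor):
    # major_factor is 12 for feet -> inches, or 60 for minutes -> seconds.
    # ASSUMPTION: the third field is two-digit hundredths of the minor unit.
    m = _PATTERN.search(mark)
    if m is None:
        return None
    major, minor, hundredths = (int(g) for g in m.groups())
    return (major * major_factor + minor) * 100 + hundredths


print(total_units("'10:03.50", 12))  # 10 ft 3.50 in, in hundredths of an inch
print(total_units("'10:03.00", 60))  # 10 min 3.00 s, in hundredths of a second
```

Unlike the digit-concatenation trick, these totals can also be subtracted meaningfully, for example to report the margin between two results.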
Why not do
a_val = a_parts[0][0].to_i * 10000 + a_parts[0][1].to_i * 100 + a_parts[0][2].to_i
b_val = b_parts[0][0].to_i * 10000 + b_parts[0][1].to_i * 100 + b_parts[0][2].to_i
a_val <=> b_val
The numbers won't make sense to subtract, etc but they should sort ok.
You may want to check [1] and [2] are always two digits in the regexp.