In Python, how to remove duplicates from a list/string using only for loops and if/else statements, without using an empty list or the set function

I want to remove duplicates from a list using only for loops and if/else statements,
without using set, enumerate, fromkeys, or a new empty array/list.
In the given list, I want to keep only the first appearance of each item and delete its second, third, and any later appearances.
a = ['aa', 'bb', 'cc', 'aa', 'dd', 'ee', 'aa', 'cc', 'ff']
Here I want to remove 'aa' at index 3 and index 6 and 'cc' at index 7, while keeping the rest of the list as it is.
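One way this could be approached, offered as a minimal sketch rather than a definitive answer: walk the indexes from the end of the list toward the front and delete an item whenever an equal item sits at an earlier index. The function name remove_duplicates_in_place is made up for illustration; the sketch uses only for loops, if statements, range, len, and del, and it modifies the list in place.

```python
def remove_duplicates_in_place(items):
    # Hypothetical helper name. Going from the last index down to 1 means
    # a deletion never shifts an index that has not been visited yet.
    for i in range(len(items) - 1, 0, -1):
        # Look for an equal item at any earlier index.
        for j in range(i):
            if items[j] == items[i]:
                # A copy exists earlier, so this later appearance is dropped.
                del items[i]
                break


a = ['aa', 'bb', 'cc', 'aa', 'dd', 'ee', 'aa', 'cc', 'ff']
remove_duplicates_in_place(a)
print(a)  # ['aa', 'bb', 'cc', 'dd', 'ee', 'ff']
```

The nested loops make this quadratic in the list length, which is the price of avoiding a set or a second list.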

How to add an array formula for updates

I'm working on trying to add an array formula so that the status column updates as information is entered.
This is an order tracking sheet. When an order is entered, the status should be set to Available. Once a driver is assigned, it should change to Dispatched. As times are entered in the In and Out columns, it should change to Picked and then Delivered. Finally, once the order is checked off as Billed, the status should be marked Complete.
So far I've only gotten to
=ARRAYFORMULA(IF(LEN(D3:D&H3:H&I3:I&M3:M&N3:N)=0, "Available", IF(D3:D<>"", "Dispatched")))
I haven't been able to figure out past that.
https://docs.google.com/spreadsheets/d/13tOkLwtPYyWm9rUfkCidqwUygzFfVfmIzV-EwL5i7hM/edit?usp=sharing
I wasn't sure which column is used to mark an order as billed, so I guessed column P; change it to the proper column. Enter this formula in row 1 of your status column.
={"Status";
ARRAYFORMULA(
IFERROR(
IFS(
P2:P<>"", "Complete",
J2:J<>"", "Delivered",
I2:I<>"", "Picked",
E2:E<>"", "Dispatched"
),
"")
)}
Explanation
Starting inside and working out:
The ranges J2:J, etc., tell the array formula which columns to evaluate as it goes row by row, starting at row 2 and ending at the bottom of the sheet.
The IFS formula checks condition/result pairs. If the first condition is met, it returns that pair's result; if not, it moves to the next pair, and so on until it finds a condition that is true or reaches the end.
The IFS formula will return an error if it doesn't find any of the conditions to be true, so the IFERROR says to leave the cell blank if there are no true conditions.
The ARRAYFORMULA evaluates each row for the defined range.
The { xxxx ; xxxxxxx } construct stacks the items after the semicolon below the item before the semicolon. In this usage, the label "Status" is in row 1, and the results are in the rows below it. Keeping the formula in the header row makes it less likely that someone accidentally sorts or erases it.
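For readers who find the first-match-wins behaviour of IFS easier to follow in code, here is a rough Python sketch of the same logic for a single row. It is only an analogy, not how Sheets evaluates the formula; the function and parameter names are invented, and the mapping to columns P, J, I, and E follows the guess made in the formula above.

```python
def order_status(billed, out_time, in_time, driver):
    # Hypothetical names: billed ~ column P, out_time ~ column J,
    # in_time ~ column I, driver ~ column E.
    # Pairs are checked top to bottom, like the arguments of IFS.
    rules = [
        (billed, "Complete"),
        (out_time, "Delivered"),
        (in_time, "Picked"),
        (driver, "Dispatched"),
    ]
    for value, status in rules:
        if value != "":
            return status  # the first non-empty cell decides the status
    return ""  # mirrors IFERROR(..., ""): no condition was true


print(order_status("", "", "09:15", "Sam"))  # Picked
```

Because the most advanced stage is listed first, a fully processed row reports "Complete" even though its earlier columns are also filled in.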

Python 3.6 How do you efficiently compare 2 unordered lists of strings, preserve duplicates, and resolve for unequal list sizes?

In my case, I have two lists of varying lengths. Every entry in the first list must have a matching entry in the second for the comparison to succeed, and duplicates must be matched entry for entry. For instance:
['I', 'have', 'a', 'list', 'with', 'a', 'duplicate'] and ['I', 'have', 'a', 'list', 'with', 'no', 'duplicate'] would return False
and
['I', 'have', 'a', 'list', 'with', 'a', 'duplicate'] and ['I', 'have', 'a', 'list', 'with', 'a', 'possible', 'duplicate'] would return True
I have tried sets, but they don't preserve duplicates. I have also tried iterating through the lists with for ... in loops, but that proved too slow for the number of comparisons that need to be completed at runtime.
There must be a more efficient way to do this. Who can help me?
------------------------------------------------------------EDIT------------------------------------------------------------
This was originally asked with the intention of finding the most efficient way of solving the Ransom Note challenge on HackerRank.
If I have understood your question correctly, all you need to do is loop over the first list, checking whether the corresponding entry in the second list matches, and making sure to skip any words in the second list that do not match. If you manage to exhaust the first list this way, you know the lists match by your criteria.
If your previous solution was too slow, I imagine it's because you were using a slower algorithm. It's hard to tell, since you haven't described the sample space in any way (if it's massive, you might need to dip into a low-level language, or use specialised libraries like numpy).
This is not the prettiest implementation one could imagine, but I'll leave it here for now, until I have time to polish it. Apologies for the sloppy naming and poor style.
first = ['I', 'have', 'a', 'list', 'with', 'a', 'duplicate'], ['I', 'have', 'a', 'list', 'with', 'no', 'duplicate']
second = ['I', 'have', 'a', 'list', 'with', 'a', 'duplicate'], ['I', 'have', 'a', 'list', 'with', 'a', 'possible', 'duplicate']

# TODO: Come up with Haskell solution
# TODO: Parallelise
# TODO: Suggest numpy solution

def match(a, b):
    c = iter(b)
    # Loop over the first list
    for word in a:
        try:
            # Checking whether there's a matching entry in the other list,
            # making sure to skip past non-matches
            while word != next(c):
                pass
        except StopIteration:
            # We've run out of entries in the second list,
            # before exhausting the first one.
            return False
    else:
        # We've matched all entries in the first list
        return True

assert not match(first[0], first[1])
assert match(second[0], second[1])
I think I may have found the answer, from piecing together other things I've been looking at (somewhere around 20 different answers to similar questions missing critical pieces and needing to be put together).
The Counter from collections seems to be the way to go about this, using the - operator and checking for an empty dict in the result. A contrived situation that illustrates this is the interview question "How do you tell if a magazine (list) contains every word you wish to have in a ransom note (list)?"
The most efficient answer I can think of, and piece together, as mentioned above is:
from collections import Counter

def ransom_note(magazine, ransom):
    return (Counter(ransom) - Counter(magazine)) == {}
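As a quick check of the Counter idea against the two example pairs from the question, here is a small sketch. The name can_build and its parameter names are made up for illustration; the only library behaviour relied on is that subtracting one Counter from another keeps only positive counts, so an empty result means nothing is missing.

```python
from collections import Counter


def can_build(required, available):
    # Hypothetical helper: True when every word in `required`, duplicates
    # included, can be matched by a word in `available`.
    missing = Counter(required) - Counter(available)  # keeps only counts > 0
    return not missing  # an empty Counter is falsy


wanted = ['I', 'have', 'a', 'list', 'with', 'a', 'duplicate']
short = ['I', 'have', 'a', 'list', 'with', 'no', 'duplicate']
longer = ['I', 'have', 'a', 'list', 'with', 'a', 'possible', 'duplicate']

print(can_build(wanted, short))   # False: the second 'a' has no match
print(can_build(wanted, longer))  # True
```

Building each Counter is a single pass over its list, so the whole check is linear in the combined length and does not depend on word order.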
If anyone has something more efficient, please let me know.
Thanks,
Spiznak

When using two filters in a cell array, an error is raised when a filter has no matches

I want to look through two small ranges on another sheet for rows with a certain string, then remove a column from the results and print them out in a list.
Here is my code:
={FILTER( FILTER(General!A6:D, SEARCH("Kevin's Credit",General!D6:D)), {TRUE, TRUE, TRUE, FALSE}); FILTER( FILTER(General!K6:N, SEARCH("Kevin's Credit",General!N6:N)), {TRUE, TRUE, TRUE, FALSE})}
If both ranges contain the string at least once, it prints all rows that have it. If even one range doesn't contain it, neither prints and it shows an error:
Error in ARRAY_LITERAL, an Array Literal was missing values for one or more rows.
What do I need to do to fix this?
The problem with a construction such as
={filter(A:B, B:B > 5); {true, false}}
is that when there are no rows matching the criterion B:B > 5, the array evaluates to
={#N/A; {true, false}}
where #N/A occupies a single cell. This is a malformed array, with one cell in the 1st row and two in the 2nd row.
You can get around this by replacing the single #N/A with several, so that the row has the appropriate number of columns in any event.
={iferror(filter(A:B, B:B > 5), {na(), na()}); {true, false}}
Of course, something other than na() can be used here.
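The reasoning can be modelled outside Sheets. The Python sketch below is only an analogy, with invented function names and None standing in for #N/A: stacking blocks of rows fails when the row widths differ, and a fallback row of the right width avoids that.

```python
def stack_rows(*blocks):
    # Hypothetical model of { block1 ; block2 }: every row must have the
    # same number of cells, otherwise the stacked array is malformed.
    rows = [row for block in blocks for row in block]
    if len({len(row) for row in rows}) > 1:
        raise ValueError("array literal was missing values for one or more rows")
    return rows


def filter_or_placeholder(rows, keep, width):
    # Hypothetical model of iferror(filter(...), {na(), na()}): when nothing
    # matches, return one placeholder row that is `width` cells wide.
    matches = [row for row in rows if keep(row)]
    return matches if matches else [[None] * width]


data = [[1, 2], [3, 4]]
safe = filter_or_placeholder(data, lambda row: row[1] > 5, width=2)
print(stack_rows(safe, [[True, False]]))  # [[None, None], [True, False]]
```

A single one-cell placeholder, as in the unpatched formula, would make stack_rows raise, which is the situation the error message describes.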
Also, in your specific case you don't need the row with true-false values and the outer filter by them. Just don't include columns D or N in the first place:
={filter(General!A6:C, search("Kevin's Credit", General!D6:D)); filter(General!K6:M, search("Kevin's Credit", General!N6:N))}
or, in the #N/A-safe form,
={iferror(
filter(General!A6:C, search("Kevin's Credit", General!D6:D)),
{na(), na(), na()});
iferror(
filter(General!K6:M, search("Kevin's Credit", General!N6:N)),
{na(), na(), na()})
}
The second argument of the filter command can be separate from the first; all that matters is that they have the same number of rows.

Is there an algorithm to merge this kind of list?

I have a list like this:
a a . a
b . . a
a . a .
c . a .
or in list format
[['a', 'a', '.', 'a'],
['b', '.', '.', 'a'],
['a', '.', 'a', '.'],
['c', '.', 'a', '.']]
and I want to merge it into [['a','b','c'],['a'],['a'],['a']], or
a,b,c a a a
so that when two consecutive rows share the same letter at any of the four columns, their elements will be merged non-redundantly. I can do it by pairwise comparisons, but wonder if there are formal algorithms to do this?
Thanks.
You didn't specify the language, but you can create a HashMap/HashTable for each column and populate it with the column values. (Your key and value will be the same thing.) A HashMap cannot hold duplicate keys, so you will end up with a collection of unique values for each column. Then pull the values out of each HashMap into an array, or arrays. If the periods in your sample data are actual periods, you will have to skip them as you loop through the array; otherwise they will appear in the output.
Take a look at Python dictionaries.
The pseudocode for this solution (Python will look similar ;-)
Create a list of dictionaries (list length = # of columns)
Loop over columns
    Loop over rows
        Insert data into appropriate dictionary
Loop over list of dictionaries
    Loop over dictionary values
        Create new set of arrays with unique values
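The pseudocode above could be sketched in Python roughly as follows. The function name merge_columns and the placeholder parameter are assumptions for illustration, and the sketch relies on dictionaries preserving insertion order (guaranteed since Python 3.7) so that values come out in the order they first appear.

```python
def merge_columns(rows, placeholder='.'):
    # Hypothetical helper: collect the unique values of each column,
    # skipping the placeholder, using one dictionary per column.
    if not rows:
        return []
    seen = [{} for _ in rows[0]]  # list length = number of columns
    for row in rows:
        for col, value in enumerate(row):
            if value != placeholder:
                seen[col][value] = value  # key and value are the same thing
    # Pull the unique values out of each dictionary into a list.
    return [list(column) for column in seen]


grid = [['a', 'a', '.', 'a'],
        ['b', '.', '.', 'a'],
        ['a', '.', 'a', '.'],
        ['c', '.', 'a', '.']]
print(merge_columns(grid))  # [['a', 'b', 'c'], ['a'], ['a'], ['a']]
```

Note that this merges each column across all rows; it does not check that rows are consecutive or share a letter, so it matches the example output but may be broader than what the question intends.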

Using array.delete(array.min) to delete one instance of a value

I'm trying to create a dice game program. I want to select one of the lowest of five dice, move it to a new array, and remove that single occurrence of the lowest die from the original array. The problem is that if more than one die shows the lowest value, this happens:
array = [1, 1, 2, 3, 4]
new_array = []
new_array.push(array.min)  # new_array is now [1]
array.delete(array.min)    # array is now [2, 3, 4]; both 1s are gone
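Array#delete removes every element equal to its argument. To remove only one occurrence, sort the array in place and take its first element:
new_array = array.sort!.slice!(0..0)
or
new_array.push(array.sort!.shift)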
