All of the option to replace an unknown number of characters - algorithm

I am trying to find an algorithm that for an unknown number of characters in a string, produces all of the options for replacing some characters with stars.
For example, for the string "abc", the output should be:
*bc
a*c
ab*
**c
*b*
a**
***
It is simple enough with a known number of stars, just run through all of the options with for loops, but I'm having difficulties with an all of the options.

Every star combination corresponds to binary number, so you can use simple cycle
for i = 1 to 2^n-1
where n is string length
and set stars to the positions of 1-bits of binary representations of i
for example: i=5=101b => * b *

This is basically a binary increment problem.
You can create a vector of integer variables to represent a binary array isStar and for each iteration you "add one" to the vector.
bool AddOne (int* isStar, int size) {
isStar[size - 1] += 1
for (i = size - 1; i >= 0; i++) {
if (isStar[i] > 1) {
if (i = 0) { return true; }
isStar[i] = 0;
isStar[i - 1] += 1;
}
}
return false;
}
That way you still have the original string while replacing the characters

This is a simple binary counting problem, where * corresponds to a 1 and the original letter to a 0. So you could do it with a counter, applying a bit mask to the string, but it's just as easy to do the "counting" in place.
Here's a simple implementation in C++:
(Edit: The original question seems to imply that at least one character must be replaced with a star, so the count should start at 1 instead of 0. Or, in the following, the post-test do should be replaced with a pre-test for.)
#include <iostream>
#include <string>
// A cleverer implementation would implement C++'s iterator protocol.
// But that would cloud the simple logic of the algorithm.
class StarReplacer {
public:
StarReplacer(const std::string& s): original_(s), current_(s) {}
const std::string& current() const { return current_; }
// returns true unless we're at the last possibility (all stars),
// in which case it returns false but still resets current to the
// original configuration.
bool advance() {
for (int i = current_.size()-1; i >= 0; --i) {
if (current_[i] == '*') current_[i] = original_[i];
else {
current_[i] = '*';
return true;
}
}
return false;
}
private:
std::string original_;
std::string current_;
};
int main(int argc, const char** argv) {
for (int a = 1; a < argc; ++a) {
StarReplacer r(argv[a]);
do {
std::cout << r.current() << std::endl;
} while (r.advance());
std::cout << std::endl;
}
return 0;
}

Related

Recursive algorithm to find all possible solutions in a nonogram row

I am trying to write a simple nonogram solver, in a kind of bruteforce way, but I am stuck on a relatively easy task. Let's say I have a row with clues [2,3] that has a length of 10
so the solutions are:
$$-$$$----
$$--$$$---
$$---$$$--
$$----$$$-
$$-----$$$
-$$----$$$
--$$---$$$
---$$--$$$
----$$-$$$
-$$---$$$-
--$$-$$$--
I want to find all the possible solutions for a row
I know that I have to consider each block separately, and each block will have an availible space of n-(sum of remaining blocks length + number of remaining blocks) but I do not know how to progress from here
Well, this question already have a good answer, so think of this one more as an advertisement of python's prowess.
def place(blocks,total):
if not blocks: return ["-"*total]
if blocks[0]>total: return []
starts = total-blocks[0] #starts = 2 means possible starting indexes are [0,1,2]
if len(blocks)==1: #this is special case
return [("-"*i+"$"*blocks[0]+"-"*(starts-i)) for i in range(starts+1)]
ans = []
for i in range(total-blocks[0]): #append current solutions
for sol in place(blocks[1:],starts-i-1): #with all possible other solutiona
ans.append("-"*i+"$"*blocks[0]+"-"+sol)
return ans
To test it:
for i in place([2,3,2],12):
print(i)
Which produces output like:
$$-$$$-$$---
$$-$$$--$$--
$$-$$$---$$-
$$-$$$----$$
$$--$$$-$$--
$$--$$$--$$-
$$--$$$---$$
$$---$$$-$$-
$$---$$$--$$
$$----$$$-$$
-$$-$$$-$$--
-$$-$$$--$$-
-$$-$$$---$$
-$$--$$$-$$-
-$$--$$$--$$
-$$---$$$-$$
--$$-$$$-$$-
--$$-$$$--$$
--$$--$$$-$$
---$$-$$$-$$
This is what i got:
#include <iostream>
#include <vector>
#include <string>
using namespace std;
typedef std::vector<bool> tRow;
void printRow(tRow row){
for (bool i : row){
std::cout << ((i) ? '$' : '-');
}
std::cout << std::endl;
}
int requiredCells(const std::vector<int> nums){
int sum = 0;
for (int i : nums){
sum += (i + 1); // The number + the at-least-one-cell gap at is right
}
return (sum == 0) ? 0 : sum - 1; // The right-most number don't need any gap
}
bool appendRow(tRow init, const std::vector<int> pendingNums, unsigned int rowSize, std::vector<tRow> &comb){
if (pendingNums.size() <= 0){
comb.push_back(init);
return false;
}
int cellsRequired = requiredCells(pendingNums);
if (cellsRequired > rowSize){
return false; // There are no combinations
}
tRow prefix;
int gapSize = 0;
std::vector<int> pNumsAux = pendingNums;
pNumsAux.erase(pNumsAux.begin());
unsigned int space = rowSize;
while ((gapSize + cellsRequired) <= rowSize){
space = rowSize;
space -= gapSize;
prefix.clear();
prefix = init;
for (int i = 0; i < gapSize; ++i){
prefix.push_back(false);
}
for (int i = 0; i < pendingNums[0]; ++i){
prefix.push_back(true);
space--;
}
if (space > 0){
prefix.push_back(false);
space--;
}
appendRow(prefix, pNumsAux, space, comb);
++gapSize;
}
return true;
}
std::vector<tRow> getCombinations(const std::vector<int> row, unsigned int rowSize) {
std::vector<tRow> comb;
tRow init;
appendRow(init, row, rowSize, comb);
return comb;
}
int main(){
std::vector<int> row = { 2, 3 };
auto ret = getCombinations(row, 10);
for (tRow r : ret){
while (r.size() < 10)
r.push_back(false);
printRow(r);
}
return 0;
}
And my output is:
$$-$$$----
$$--$$$---
$$---$$$--
$$----$$$--
$$-----$$$
-$$-$$$----
-$$--$$$--
-$$---$$$-
-$$----$$$-
--$$-$$$--
--$$--$$$-
--$$---$$$
---$$-$$$-
---$$--$$$
----$$-$$$
For sure, this must be absolutely improvable.
Note: i did't test it more than already written case
Hope it works for you

Sort odd and even numbers separatedly and move all odd numbers in front

For example, if the input array is
832461905
The output is
1357902468
I think this can be done in two steps
1) sort data
012345678
2) move odd numbers in front of even numbers by preserving order
To do so, we can have two pointers
Initially one points to the beginning and the other points to the end
Move the head util even numbers are found
The move the tail until odd numbers are found
Swap data at the pointers
Do the above until the two pointers meet
My question is if we can solve the problem by using one step rather than two
All you need is a little comp-function for sorting:
bool comp(int x, int y)
{
if (x % 2 == y % 2) return x < y;
return x % 2 > y % 2;
}
...
sort(your_array.begin(), your_array.end(), comp);
Yes, it can be done in one step.
Write your own comparison function, and use std::sort in C++:
sort(data.begin(),data.end(),comp);
bool comp(int x,int y)
{
if (x%2==0)
{
if(y%2==0)
{
return x<y; // if both are even
}
else
{
return false; // if only x is even
}
}
else
{
if(y%2==0)
{
return true;
}
else
{
return x<y;
}
}
}
Under the <algorithm> library in C++ you can use sort to order the numbers and then stable_partition to separate by odd and even.
Like so:
auto arr = std::valarray<int>{8,3,2,4,6,1,9,0,5};
std::sort(std::begin(arr), std::end(arr));
std::stable_partition(std::begin(arr), std::end(arr), [](int a){ return a % 2; });
Resulting in a rather succinct solution.
I am considering you are familiar with C++. See my code snippet, and yes it can be done in a step:
#include <iostream>
#include <stdio.h>
#include <algorithm>
bool function(int a, int b) {
if(a%2 != b%2) { /* When one is even and another is odd */
if(a&1) {
return true;
} else {
return false;
}
} else { /* When both are either odd or even */
return (a<b);
}
}
int main() {
int input[10005]; /* Input array */
int n = -1, i;
/* Take the input */
while(scanf("%i", &input[++n]) != EOF);
/* Sort according to desire condition */
std::sort(input, input+n, function);
/* Time to print out the values */
for(i=0; i<n; i++) {
std::cout << input[i] << " ";
}
return 0;
}
Any confusion, comments most welcome.

Algorithm to print all permutations with repetition of numbers

I have successfully designed the algorithm to print all the permutations with the repetition of numbers. But the algorithm which I have designed has a flaw. It works only if the chars of the string are unique.
Can someone help me out in extending the algorithm for the case where chars of the string may not be unique..
My code so far :
#include<cstdio>
#include<cstdlib>
#include<cstring>
#include<climits>
#include<iostream>
using namespace std;
void _perm(char *arr, char*result, int index)
{
static int count = 1;
if (index == strlen(arr))
{
cout << count++ << ". " << result << endl;
return;
}
for (int i = 0; i < strlen(arr); i++)
{
result[index] = arr[i];
_perm(arr, result, index + 1);
}
}
int compare(const void *a, const void *b)
{
return (*(char*)a - *(char*)b);
}
void perm(char *arr)
{
int n = strlen(arr);
if (n == 0)
return;
qsort(arr, n, sizeof(char), compare);
char *data = new char[n];
_perm(arr, data, 0);
free(data);
return;
}
int main()
{
char arr[] = "BACD";
perm(arr);
return 0;
}
I am printing the output strings in lexicographically sorted way.
I am referring to the example.3 from this page.
http://www.vitutor.com/statistics/combinatorics/permutations_repetition.html
Thanks.
Your code doesn't print permutations, but four draws from the string pool with repetition. It will produce 4^4 == 256 combinations, one of which is "AAAA".
The code Karnuakar linked to will give you permutations of a string, but without distinguishing between the multiple occurrences of certain letters. You need some means to prevent recursing with the same letter in each recursion step. In C++, this can be done with a set.
The example code below uses a typical C string, but uses the terminating '\0' to detect the end. The C-string functions from <cstring> are not needed. The output will not be sorted unless the original string was sorted.
#include <iostream>
#include <algorithm>
#include <set>
using namespace std;
void perm(char *str, int index = 0)
{
std::set<char> used;
char *p = str + index;
char *q = p;
if (*p == '\0') {
std::cout << str << std::endl;
return;
}
while (*q) {
if (used.find(*q) == used.end()) {
std::swap(*p, *q);
perm(str, index + 1);
std::swap(*p, *q);
used.insert(*q);
}
q++;
}
}
int main()
{
char arr[] = "AAABB";
perm(arr);
return 0;
}
This will produce 5! == 120 permutations for "ABCDE", but only 5! / (2! 3!) == 10 unique permutations for "AAABB". It will also create the 1260 permutations from the linked exercise.

Parsing morse code

I am trying to solve this problem.
The goal is to determine the number of ways a morse string can be interpreted, given a dictionary of word.
What I did is that I first "translated" words from my dictionary into morse. Then, I used a naive algorithm, searching for all the ways it can be interpreted recursively.
#include <iostream>
#include <vector>
#include <map>
#include <string>
#include <iterator>
using namespace std;
string morse_string;
int morse_string_size;
map<char, string> morse_table;
unsigned int sol;
void matches(int i, int factor, vector<string> &dictionary) {
int suffix_length = morse_string_size-i;
if (suffix_length <= 0) {
sol += factor;
return;
}
map<int, int> c;
for (vector<string>::iterator it = dictionary.begin() ; it != dictionary.end() ; it++) {
if (((*it).size() <= suffix_length) && (morse_string.substr(i, (*it).size()) == *it)) {
if (c.find((*it).size()) == c.end())
c[(*it).size()] = 0;
else
c[(*it).size()]++;
}
}
for (map<int, int>::iterator it = c.begin() ; it != c.end() ; it++) {
matches(i+it->first, factor*(it->second), dictionary);
}
}
string encode_morse(string s) {
string ret = "";
for (unsigned int i = 0 ; i < s.length() ; ++i) {
ret += morse_table[s[i]];
}
return ret;
}
int main() {
morse_table['A'] = ".-"; morse_table['B'] = "-..."; morse_table['C'] = "-.-."; morse_table['D'] = "-.."; morse_table['E'] = "."; morse_table['F'] = "..-."; morse_table['G'] = "--."; morse_table['H'] = "...."; morse_table['I'] = ".."; morse_table['J'] = ".---"; morse_table['K'] = "-.-"; morse_table['L'] = ".-.."; morse_table['M'] = "--"; morse_table['N'] = "-."; morse_table['O'] = "---"; morse_table['P'] = ".--."; morse_table['Q'] = "--.-"; morse_table['R'] = ".-."; morse_table['S'] = "..."; morse_table['T'] = "-"; morse_table['U'] = "..-"; morse_table['V'] = "...-"; morse_table['W'] = ".--"; morse_table['X'] = "-..-"; morse_table['Y'] = "-.--"; morse_table['Z'] = "--..";
int T, N;
string tmp;
vector<string> dictionary;
cin >> T;
while (T--) {
morse_string = "";
cin >> morse_string;
morse_string_size = morse_string.size();
cin >> N;
for (int j = 0 ; j < N ; j++) {
cin >> tmp;
dictionary.push_back(encode_morse(tmp));
}
sol = 0;
matches(0, 1, dictionary);
cout << sol;
if (T)
cout << endl << endl;
}
return 0;
}
Now the thing is that I only have 3 seconds of execution time allowed, and my algorithm won't work under this limit of time.
Is this the good way to do this and if so, what am I missing ? Otherwise, can you give some hints about what is a good strategy ?
EDIT :
There can be at most 10 000 words in the dictionary and at most 1000 characters in the morse string.
A solution that combines dynamic programming with a rolling hash should work for this problem.
Let's start with a simple dynamic programming solution. We allocate an vector which we will use to store known counts for prefixes of morse_string. We then iterate through morse_string and at each position we iterate through all words and we look back to see if they can fit into morse_string. If they can fit then we use the dynamic programming vector to determine how many ways we could have build the prefix of morse_string up to i-dictionaryWord.size()
vector<long>dp;
dp.push_back(1);
for (int i=0;i<morse_string.size();i++) {
long count = 0;
for (int j=1;j<dictionary.size();j++) {
if (dictionary[j].size() > i) continue;
if (dictionary[j] == morse_string.substring(i-dictionary[j].size(),i)) {
count += dp[i-dictionary[j].size()];
}
}
dp.push_back(count);
}
result = dp[morse_code.size()]
The problem with this solution is that it is too slow. Let's say that N is the length of morse_string and M is the size of the dictionary and K is the size of the largest word in the dictionary. It will do O(N*M*K) operations. If we assume K=1000 this is about 10^10 operations which is too slow on most machines.
The K cost came from the line dictionary[j] == morse_string.substring(i-dictionary[j].size(),i)
If we could speed up this string matching to constant or log complexity we would be okay. This is where rolling hashing comes in. If you build a rolling hash array of morse_string then the idea is that you can compute the hash of any substring of morse_string in O(1). So you could then do hash(dictionary[j]) == hash(morse_string.substring(i-dictionary[j].size(),i))
This is good but in the presence of imperfect hashing you could have multiple words from the dictionary with the same hash. That would mean that after getting a hash match you would still need to match the strings as well as the hashes. In programming contests, people often assume perfect hashing and skip the string matching. This is often a safe bet especially on a small dictionary. In case it doesn't produce a perfect hashing (which you can check in code) you can always adjust your hash function slightly and maybe the adjusted hash function will produce a perfect hashing.

Filter only digit sequences containing a given set of digits

I have a large list of digit strings like this one. The individual strings are relatively short (say less than 50 digits).
data = [
'300303334',
'53210234',
'123456789',
'5374576807063874'
]
I need to find out a efficient data structure (speed first, memory second) and algorithm which returns only those strings that are composed of a given set of digits.
Example results:
filter(data, [0,3,4]) = ['300303334']
filter(data, [0,1,2,3,4,5]) = ['300303334', '53210234']
The data list will usually fit into memory.
For each digit, precompute a postings list that don't contain the digit.
postings = [[] for _ in xrange(10)]
for i, d in enumerate(data):
for j in xrange(10):
digit = str(j)
if digit not in d:
postings[j].append(i)
Now, to find all strings that contain, for example, just the digits [1, 3, 5] you can merge the postings lists for the other digits (ie: 0, 2, 4, 6, 7, 8, 9).
def intersect_postings(p0, p1):
i0, i1 = next(p0), next(p1)
while True:
if i0 == i1:
yield i0
i0, i1 = next(p0), next(p1)
elif i0 < i1: i0 = next(p0)
else: i1 = next(p1)
def find_all(digits):
p = None
for d in xrange(10):
if d not in digits:
if p is None: p = iter(postings[d])
else: p = intersect_postings(p, iter(postings[d]))
return (data[i] for i in p) if p else iter(data)
print list(find_all([0, 3, 4]))
print list(find_all([0, 1, 2, 3, 4, 5]))
A string can be encoded by a 10-bit number. There are 2^10, or 1,024 possible values.
So create a dictionary that uses an integer for a key and a list of strings for the value.
Calculate the value for each string and add that string to the list of strings for that value.
General idea:
Dictionary Lookup;
for each (string in list)
value = 0;
for each character in string
set bit N in value, where N is the character (0-9)
Lookup[value] += string // adds string to list for this value in dictionary
Then, to get a list of the strings that match your criteria, just compute the value and do a direct dictionary lookup.
So if the user asks for strings that contain only 3, 5, and 7:
value = (1 << 3) || (1 << 5) || (1 << 7);
list = Lookup[value];
Note that, as Matt pointed out in comment below, this will only return strings that contain all three digits. So, for example, it wouldn't return 37. That seems like a fatal flaw to me.
Edit
If the number of symbols you have to deal with is very large, then the number of possible combinations becomes too large for this solution to be practical.
With a large number of symbols, I'd recommend an inverted index as suggested in the comments, combined with a secondary filter that removes the strings that contain extraneous digits.
Consider a function f which constructs a bitmask for each string with bit i set if digit i is in the string.
For example,
f('0') = 0b0000000001
f('00') = 0b0000000001
f('1') = 0b0000000010
f('1100') = 0b0000000011
Then I suggest storing a list of strings for each bitmask.
For example,
Bitmask 0b0000000001 -> ['0','00']
Once you have prepared this data structure (which is the same size as your original list), you can then easily access all the strings for a particular filter by accessing all lists where the bitmask is a subset of the digits in your filter.
So for your example of filter [0,3,4] you would return the lists from:
Strings containing just 0
Strings containing just 3
Strings containing just 4
Strings containing 0 and 3
Strings containing 0 and 4
Strings containing 3 and 4
Strings containing 0 and 3 and 4
Example Python Code
from collections import defaultdict
import itertools
raw_data = [
'300303334',
'53210234',
'123456789',
'5374576807063874'
]
def preprocess(raw_data):
data = defaultdict(list)
for s in raw_data:
bitmask = 0
for digit in s:
bitmask |= 1<<int(digit)
data[bitmask].append(s)
return data
def filter(data,mask):
for r in range(len(mask)):
for m in itertools.combinations(mask,r+1):
bitmask = sum(1<<digit for digit in m)
for s in data[bitmask]:
yield s
data = preprocess(raw_data)
for a in filter(data, [0,1,2,3,4,5]):
print a
Just for kicks, I have coded up Jim's lovely algorithm and the Perl is here if anyone wants to play with it. Please do not accept this as an answer or anything, pass all credit to Jim:
#!/usr/bin/perl
use strict;
use warnings;
my $Debug=1;
my $Nwords=1000;
my ($word,$N,$value,$i,$j,$k);
my (#dictionary,%Lookup);
################################################################################
# Generate "words" with random number of characters 5-30
################################################################################
print "DEBUG: Generating $Nwords word dictionary\n" if $Debug;
for($i=0;$i<$Nwords;$i++){
$j = rand(25) + 5; # length of this word
$word="";
for($k=0;$k<$j;$k++){
$word = $word . int(rand(10));
}
$dictionary[$i]=$word;
print "$word\n" if $Debug;
}
# Add some obvious test cases
$dictionary[++$i]="0" x 50;
$dictionary[++$i]="1" x 50;
$dictionary[++$i]="2" x 50;
$dictionary[++$i]="3" x 50;
$dictionary[++$i]="4" x 50;
$dictionary[++$i]="5" x 50;
$dictionary[++$i]="6" x 50;
$dictionary[++$i]="7" x 50;
$dictionary[++$i]="8" x 50;
$dictionary[++$i]="9" x 50;
$dictionary[++$i]="0123456789";
################################################################################
# Encode words
################################################################################
for $word (#dictionary){
$value=0;
for($i=0;$i<length($word);$i++){
$N=substr($word,$i,1);
$value |= 1 << $N;
}
push(#{$Lookup{$value}},$word);
print "DEBUG: $word encoded as $value\n" if $Debug;
}
################################################################################
# Do lookups
################################################################################
while(1){
print "Enter permitted digits, separated with commas: ";
my $line=<STDIN>;
my #digits=split(",",$line);
$value=0;
for my $d (#digits){
$value |= 1<<$d;
}
print "Value: $value\n";
print join(", ",#{$Lookup{$value}}),"\n\n" if defined $Lookup{$value};
}
I like Jim Mischel's approach. It has pretty efficient look up and bounded memory usage. Code in C follows:
#include <stdlib.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <readline/readline.h>
#include <readline/history.h>
enum {
zero = '0',
nine = '9',
numbers = nine - zero + 1,
masks = 1 << numbers,
};
typedef uint16_t mask;
struct list {
char *s;
struct list *next;
};
typedef struct list list_cell;
typedef struct list *list;
static inline int is_digit(char c) { return c >= zero && c <= nine; }
static inline mask char2mask(char c) { return 1 << (c - zero); }
static inline mask add_char2mask(mask m, char c) {
return m | (is_digit(c) ? char2mask(c) : 0);
}
static inline int is_set(mask m, mask n) { return (m & n) != 0; }
static inline int is_set_char(mask m, char c) { return is_set(m, char2mask(c)); }
static inline int is_submask(mask sub, mask m) { return (sub & m) == sub; }
static inline char *sprint_mask(char buf[11], mask m) {
char *s = buf;
char i;
for(i = zero; i <= nine; i++)
if(is_set_char(m, i)) *s++ = i;
*s = 0;
return buf;
}
static inline mask get_mask(char *s) {
mask m=0;
for(; *s; s++)
m = add_char2mask(m, *s);
return m;
}
static inline int is_empty(list l) { return !l; }
static inline list insert(list *l, char *s) {
list cell = (list)malloc(sizeof(list_cell));
cell->s = s;
cell->next = *l;
return *l = cell;
}
static void *foreach(void *f(char *, void *), list l, void *init) {
for(; !is_empty(l); l = l->next)
init = f(l->s, init);
return init;
}
struct printer_state {
int first;
FILE *f;
};
static void *prin_list_member(char *s, void *data) {
struct printer_state *st = (struct printer_state *)data;
if(st->first) {
fputs(", ", st->f);
} else
st->first = 1;
fputs(s, st->f);
return data;
}
static void print_list(list l) {
struct printer_state st = {.first = 0, .f = stdout};
foreach(prin_list_member, l, (void *)&st);
putchar('\n');
}
static list *init_lu(void) { return (list *)calloc(sizeof(list), masks); }
static list *insert2lu(list lu[masks], char *s) {
mask i, m = get_mask(s);
if(m) // skip string without any number
for(i = m; i < masks; i++)
if(is_submask(m, i))
insert(lu+i, s);
return lu;
}
int usage(const char *name) {
fprintf(stderr, "Usage: %s filename\n", name);
return EXIT_FAILURE;
}
#define handle_error(msg) \
do { perror(msg); exit(EXIT_FAILURE); } while (0)
static inline void chomp(char *s) { if( (s = strchr(s, '\n')) ) *s = '\0'; }
list *load_file(FILE *f) {
char *line = NULL;
size_t len = 0;
ssize_t read;
list *lu = init_lu();
for(; (read = getline(&line, &len, f)) != -1; line = NULL) {
chomp(line);
insert2lu(lu, line);
}
return lu;
}
void read_reqs(list *lu) {
char *line;
char buf[11];
for(; (line = readline("> ")); free(line))
if(*line) {
add_history(line);
mask m = get_mask(line);
printf("mask: %s\nstrings: ", sprint_mask(buf, m));
print_list(lu[m]);
};
putchar('\n');
}
int main(int argc, const char* argv[] ) {
const char *name = argv[0];
FILE *f;
list *lu;
if(argc != 2) return usage(name);
f = fopen(argv[1], "r");
if(!f) handle_error("open");
lu = load_file(f);
fclose(f);
read_reqs(lu);
return EXIT_SUCCESS;
}
To compile use
gcc -lreadline -o digitfilter digitfilter.c
And test run:
$ cat data.txt
300303334
53210234
123456789
5374576807063874
$ ./digitfilter data.txt
> 034
mask: 034
strings: 300303334
> 0,1,2,3,4,5
mask: 012345
strings: 53210234, 300303334
> 0345678
mask: 0345678
strings: 5374576807063874, 300303334
Put each value into a set-- Eg.: '300303334'={3, 0, 4}.
Since the length of your data items are bound by a constant (50),
you can do these at O(1) time for each item using Java HashSet. The overall complexity of this phase adds up to O(n).
For each filter set, use containsAll() of HashSet to see whether
each of these data items is a subset of your filter. Takes O(n).
Takes O(m*n) in the overall where n is the number of data items and m the number of filters.

Resources