Create customer audience from email md5 hash for facebook ad - facebook-ads-api

My organisation is very sensitive about customer information and is unwilling to reveal customer emails to outside systems. However, I am trying to target these customers through Facebook ads. For this purpose, the organisation allows me to use only an MD5 hash (or any other well-known hashing algorithm) of the email.
Since I am unable to provide a plain-text email list, is it possible to create a custom audience from a list of MD5-hashed emails (not plain text)?

You can use Facebook's Custom Audience feature by providing SHA256-hashed values and using those to create a custom audience through facebook-ads-api.
Here are some relevant links to get you started:
https://developers.facebook.com/docs/marketing-api/audiences-api/
And here are some notes, if link clicking is not your thing:
Hashing and Normalization for Multi-Key
You must hash your data as SHA256; we don't support other hashing mechanisms. This is required for all data except External Identifiers, App User IDs and Page Scoped User IDs. Before hashing, normalize your data.
Email addresses - Use key EMAIL. Trim leading and trailing whitespace and convert all characters to lowercase.
Provide SHA256 values for normalized keys and HEX representations of this value, using lowercase for A through F. The hash function in PHP converts normalized email and phone number:
example: hash("sha256", "mary@example.com")
f1904cf1a9d73a55fa5de0ac823c4403ded71afd4c3248d00bdcd0866552bb79
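As a rough illustration of the normalization and hashing steps quoted above, here is a minimal Python sketch. The function name normalize_and_hash_email is made up for this example, it only covers the EMAIL key, and uploading the result through the Marketing API is not shown.

```python
import hashlib


def normalize_and_hash_email(email: str) -> str:
    """Trim whitespace, lowercase, then return the SHA256 digest as lowercase hex."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Differently formatted copies of the same address hash to the same value.
hashed = [normalize_and_hash_email(e) for e in ["  Mary@Example.com ", "mary@example.com"]]
print(hashed[0] == hashed[1])  # True
```

Because the normalization happens before hashing, it has to be done on your side; the receiving system only ever sees the digest and cannot fix case or whitespace afterwards.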

Yes, this is a very useful feature. It can be used for advanced matching as well, and not only for Facebook but also for TikTok and Google. However, this can be a tricky area: in countries where GDPR or similar laws apply, it can sometimes be considered illegal. Although the data is hashed with SHA256 and transferred as such, it cannot be considered "data processing on behalf of a controller", according to a German court. There was a case in Germany where the court decided that this cannot be used.
Definitely talk to your legal department and your data protection officer. You can find more info about this topic here:
https://floyk.com/en/post/setup-facebook-advanced-matching-for-websites


Does a Punycode domain name (UName) store the IDN table used?

I've created a domain name such as: même.vip
I can see in the database, that the domain name has been registered with IDN table: "fr".
However, 'ê' can be Portuguese, Norwegian, etc...
I am trying to understand who is assuming the IDN table here...
I can see the EPP transaction - it is not using the IDN extension and therefore cannot supply an IDN table to the server, even if it wanted to
I cannot access the code that populated that DB record
Therefore, my best chance is to know if the Punycode domain name contains information on which table was used. If not: then I know it's the DB or some service at the registry, after the EPP command.
(Of course, if the punycode DOES contain the IDN table, then I have more digging to do!)
Does a Punycode domain name (UName) store the IDN table used?
TL;DR: No.
You are mixing multiple things, but it is difficult to summarize everything (I did a very detailed answer at https://webmasters.stackexchange.com/a/122160/75842 which should help you).
For computers, whether ê is Portuguese or Norwegian makes no difference at the DNS level. In the same way, at the Unicode level, ê is "U+00EA LATIN SMALL LETTER E WITH CIRCUMFLEX", which is just defined as a "Latin" character, irrespective of which language might use it.
In short:
The IETF invented the Punycode algorithm, and more precisely the IDNA standard, just to make sure that people could use (almost) any character in their domain name. As such, the algorithm is just a translation from "any Unicode string" to "an ASCII string starting with xn--".
The domain name industry, with ICANN and all registries, then decides on rules on top of that. For example, there is a major rule, "you cannot mix characters from multiple scripts in the same string", mostly to avoid IDN homograph attacks (so not really a technical constraint); my answer above goes into full detail on this.
At the EPP level, various actors created various extensions; there is no real standardized "IDN" specification here. This is also why you will find some people speaking about "scripts", others about "languages", others about "repertoires", etc. It is a mess (Unicode only speaks about scripts, not languages). Some registries do not use any extension, while others do. Some want you to always pass an IDN "table" (aka script/language/whatever) reference; some require it only in some cases. For example, look at Verisign's IDN practices at https://www.verisign.com/en_US/channel-resources/domain-registry-products/idn/idn-policy/registration-rules/index.xhtml. It boils down to "all IDN registrations need a language tag; some of them are attached to a specific list of possible characters".
In theory you can find all existing IDN tables (in practice, only most of them) at https://www.iana.org/domains/idn-tables, and you can see they are per registry, which shows that this extra information is really not encoded in the ASCII form of the domain name after conversion by the Punycode algorithm.
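A small sketch of that last point, using Python's built-in idna codec (it implements the older IDNA 2003 rules, which is an assumption that happens to be fine for this name). The conversion takes only the Unicode string as input, so there is nowhere for an IDN table to go, and decoding gives back only the characters.

```python
name = "m\u00eame.vip"  # même.vip, written with an escape to pin the exact code point

# The codec converts each non-ASCII label with Punycode and adds the "xn--" prefix.
ascii_form = name.encode("idna")
print(ascii_form)

# Decoding recovers the Unicode characters and nothing else:
# no language tag, script, or table identifier comes back.
print(ascii_form.decode("idna"))
```

Whatever "fr" value ends up in the registry database therefore has to come from somewhere other than the encoded name.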
I am trying to understand who is assuming the IDN table here...
There should be no assumption (either it is given by registrar or not given) or there is no IDN table needed (the registry will just do the Punycode conversion in reverse and decide, based on characters found, which table it should be in).
I can see the EPP transaction - it is not using the IDN extension and therefore cannot supply an IDN table to the server, even if it wanted to
Which registry? If you are a registrar, in practice the registry should be able to help you and answer this kind of question. Note that most of the time (I could write "all the time", but I am not sure that no counterexample exists; at least I have none in mind right now), during an EPP domain:check you just pass the name (in ASCII form) without any IDN extension, while you pass the IDN extension, if any, during the domain:create. This also means that the domain:check might not give you the full proper reply, simply because at that point not everything is known.
See these EPP documents on IDN extensions:
https://datatracker.ietf.org/doc/html/draft-ietf-eppext-idnmap-02
https://datatracker.ietf.org/doc/html/draft-wilcox-cira-idn-eppext
https://tools.ietf.org/id/draft-gould-idn-table-07.html
https://datatracker.ietf.org/doc/html/draft-sienkiewicz-epp-idn-00

Unified IDs of geographical locations

I want to use location titles in my app, like 'Chicago, Illinois, USA', or 'Surrey, British Columbia, Canada', or one of the Springfields.
I am going to add them to the DB one by one during the app lifecycle (there is no need to add them all at once), and I think it would be nice to identify them all with unique IDs. I could just go from 1 to n as a key.
But for potential future flexibility, I could use some criterion that makes sure I get that very Springfield when I decode its ID and enter it somewhere, like Google.
Maybe I can use lat/lon data from public sources, e.g. Wikipedia, and turn the pair into a key? Or maybe there are already some IDs assigned by authorities or some agency that are kind of a standard?
One possibility is to use a GeoHash of the location. This would give you a unique code for each well positioned location you are using. An added bonus is that it would allow you to determine how close they were to each other too.

Security Code generation's algorithm

Alright, here's the story:
I'm getting married soon, and I'd like to create a website (or an app).
Obviously, I'd like only guests to be able to access it.
So I was thinking about a system that would require a security code to sign up.
The problem is that I do not trust everyone to keep the code to themselves, so I was thinking about giving a different code to every couple (or family) of invited people.
On the sign-up form, I would then verify that the entered code has not already been used.
But since I don't know who will sign up to the app, and I don't really have time to manually register each guest, I won't have a database recording which code has been given to whom.
So I need an algorithm to generate a random security code, and the reverse one to check whether a given string is a valid security code.
I need the algorithm to be complex enough that people cannot guess the magic behind the code they received. (I know, it feels pretty paranoid.)
The generated security code should be pretty simple, like 6 to 8 characters (a mix of digits and upper- and lower-case letters).
The main issue is that I have no clue how to build a reliable system to generate and validate security codes.
I feel like I should have a secret key stored on the server side that would be necessary to generate a code, and that I would have to recover when checking whether a given string is a valid code.
Let's say secret is my private key.
The generation algorithm would be something like secret + whatever = generated code (where the + whatever operation remains to define).
But then how could I check a given string? string - whatever =? secret would be the solution (where - whatever is the reverse operation of + whatever).
Well, I actually have no clue of what whatever could (or should) be.
Do you have any advice or guidance ?
For the technical part, I will probably code this in JS (with a NodeJS server).
But as I'm talking about the concept of security code generation, any pseudo-code will do the job.
Generate a hash of the person's email address (uppercased) and use the first n characters as the code. So, for example, if your email address is TOUPYE@GMAIL.COM, then the SHA-256 hash would be: 038122aedbf777b8c7c3aaed14ae7c08249a9d47f82f4455a0d667cacc57d383, so your code would be "038122". Generate a list of codes, one for each person/family. If someone has no email address, use their telephone number. If they do not have a telephone, use their address.
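Here is a minimal Python sketch of that idea, with one deliberate change: instead of a plain SHA-256, it uses HMAC-SHA-256 keyed with the server-side secret the question mentions, so that a guest who guesses the scheme still cannot compute someone else's code. The names SECRET, make_code and is_valid_code are hypothetical, and the sign-up form is assumed to ask for both the email and the code.

```python
import hashlib
import hmac

# Hypothetical placeholder: use a long random value and keep it on the server.
SECRET = b"replace-with-a-long-random-server-secret"


def make_code(identifier: str, secret: bytes = SECRET, length: int = 6) -> str:
    """Derive a short hex code from an email (or phone/address) and the secret."""
    normalized = identifier.strip().upper()
    digest = hmac.new(secret, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def is_valid_code(identifier: str, code: str, secret: bytes = SECRET, length: int = 6) -> bool:
    """Recompute the code for this identifier and compare in constant time."""
    expected = make_code(identifier, secret, length)
    supplied = code.strip().lower()
    return hmac.compare_digest(expected.encode("utf-8"), supplied.encode("utf-8"))


code = make_code("toupye@gmail.com")
print(is_valid_code(" TOUPYE@GMAIL.COM ", code))  # True
```

Validation needs no table of issued codes, only the secret. Checking that a code "has not already been used" still means remembering which identifiers have signed up, and six hex characters are only 24 bits, so the sign-up endpoint should limit repeated attempts.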

standardized international phone number field format as a string

I'd like to store phone numbers as unique user ids in my database/app which will initially roll out in the United States but could expand to other countries eventually.
My question is: when storing phone numbers, what's a resilient way to store the number as a string so that numbers from different countries don't overlap and produce duplicates?
My initial thought is to do it this way
+1(212)555-5555
+{countryCode}({areaCode}){{subscriberCode}} *formatted with a hyphen for U.S. numbers
Does that seem reasonable, or are there any pitfalls to that? Should spaces be used? For instance, I can't imagine other countries would use spaces or parentheses in their subscriber codes... but maybe they do? It would also be nice if it followed the standard output format from iOS and Android phones' address books.
Here's what I'd say:
Use the plus. It indicates for certain that the country code follows and the number is not in a local format. You could also not store the plus and make an internal decision that all phone numbers will be stored with the country code, thus obviating the need for the plus.
Don't use any formatting in the storage of the number. Formatting is irrelevant when it comes to dialing and it makes searching and comparing more difficult.
Use a gem like phony_rails or phoney to format the number to local conventions when displaying.
So it looks like there is an international standard
http://en.wikipedia.org/wiki/E.164
And a node.js library that can format to that standard
https://github.com/aftership/node-phone
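To make the "store without formatting" advice concrete, here is a small Python sketch that reduces an already-international number to an E.164-style string. The function name to_e164 is made up for this example, and it assumes the input already carries a plus sign and country code; turning national-format input into international form needs per-country rules, which is what libraries like the ones mentioned above are for.

```python
import re


def to_e164(raw: str) -> str:
    """Strip formatting from a number that already includes '+' and a country code."""
    digits = re.sub(r"[^0-9]", "", raw)
    has_plus = raw.strip().startswith("+")
    # E.164 allows at most 15 digits, and country codes never start with 0.
    if not has_plus or not 1 <= len(digits) <= 15 or digits.startswith("0"):
        raise ValueError(f"not an international number: {raw!r}")
    return "+" + digits


print(to_e164("+1(212)555-5555"))  # +12125555555
```

Storing this canonical form means the same number entered as "+1 212 555 5555" or "+1(212)555-5555" compares equal, which is what a unique user id needs.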

Encryption puzzle / How to create a PassStub for a Remote Assistance ticket

I am trying to create a ticket for Remote Assistance. Part of that requires creating a PassStub parameter. According to the documentation:
http://msdn.microsoft.com/en-us/library/cc240115(PROT.10).aspx
PassStub: The encrypted novice computer's password string. When the Remote
Assistance Connection String is sent as a file over e-mail, to provide additional security, a
password is used.<16>
In part 16 they detail how to create a PassStub.
In Windows XP and Windows Server 2003, when a password is used, it is encrypted using
PROV_RSA_FULL predefined Cryptographic provider with MD5 hashing and CALG_RC4, the RC4
stream encryption algorithm.
A PassStub looks like this in the file:
PassStub="LK#6Lh*gCmNDpj"
If you want to generate one yourself run msra.exe in Vista or run the Remote Assistance tool in WinXP.
The documentation says this stub is the result of the function CryptEncrypt with the key derived from the password and encrypted with the session id (Those are also in the ticket file).
The problem is that CryptEncrypt produces a binary output way larger than the 15 byte PassStub. Also, the PassStub isn't encoded in any way I've seen before.
Some interesting things about the PassStub encoding: after doing statistical analysis, the 3rd char is always one of: !#$&()+-=@^. The only symbols seen everywhere are: *_ . Otherwise the valid characters are 0-9 a-z A-Z. There are a total of 75 valid characters, and the PassStub is always 15 bytes.
Running msra.exe with the same password always generates a different PassStub, indicating that it is not a direct hash but includes the rasessionid as they say.
Another idea I've had is that it is not the direct result of CryptEncrypt, but a result of the rasessionid being included in the MD5 hash. In MS-RA (http://msdn.microsoft.com/en-us/library/cc240013(PROT.10).aspx), the "PassStub Novice" is simply hex encoded, and looks to be the right length. The problem is that I have no idea how to go from any hash to what the PassStub looks like.
I am curious, have you already:
considered using ISAFEncrypt::EncryptString(bstrEncryptionkey, bstrInputString) as a higher-level alternative to doing all the dirty work directly with CryptEncrypt? (the tlb is in hlpsvc.exe)
looked inside c:\WINDOWS\pchealth\helpctr\Vendors\CN=Microsoft Corporation,L=Redmond,S=Washington,C=US\Remote Assistance\Escalation\Email\rcscreen9.htm (WinXP) to see what is going on when you pick the Save invitation as a file (Advanced) option and provide a password? (feel free to add alert() calls inside OnSave())
