Why does this function return an (owned) value? - memory-management

the code
from: Genie howto repeat a string N times as an string array
def repeatwithsep (e: string, n: int, separator: string): string
    var elen = e.length;
    var slen = separator.length;
    var a = new StringBuilder.sized ((elen * n) + (slen * (n - 1)) + 1);
    for var i = 0 to (n - 1)
        if i != 0
            a.append_len (separator, slen)
        a.append_len (e, elen)
    return (owned) a.str
var a is a local variable; when a goes out of scope, it will be destroyed.
Why does this function use
return (owned) a.str
What is the difference between
return a.str
return (owned) a.str
What is the benefit of (owned)?

return a.str will make a copy of the string using g_strdup, because by default the function result and the StringBuilder will each own a separate copy of the string after the (implicit) assignment.
Since the StringBuilder stored in a will go out of scope, its copy will never be used again, so the extra copy is neither desirable nor efficient in this case.
Hence the solution is to pass ownership of the string from a.str to the result of the function using the (owned) directive.
BTW: You can easily find this out by compiling both versions with valac -C and comparing the generated C code:
- _tmp21_->str = NULL;
- result = _tmp22_;
+ _tmp23_ = g_strdup (_tmp22_);
+ result = _tmp23_;
(In this comparison the left side was return (owned) a.str and the right side was return a.str)
PS: This is documented in the ownership section of the Vala tutorial and also the corresponding part of the Genie tutorial.
I would also recommend the Reference Handling article.


How can I concatenate strings only if they have passed a logical statement in Swift?

My challenge is twofold:
To pick individual strings from an array of similar strings, but only if a boolean test has been passed first.
"Finally" I need to concatenate any/all of the strings generated into one complete text and the entire code must be in Swift.
Illustration: A back of the envelope code for illustration of logic:
generatedText.text =
    case Int1 <= 50 && Int2 == 50
        return generatedParagraph1 = pick one string at RANDOM from an array1 of strings
    case Int3 <= 100
        return generatedParagraph2 = pick one string at RANDOM from an array2 of strings
    case Int4 == 100
        return generatedParagraph3 = pick one string at RANDOM from an array3 of strings
    ...etc
    default
        return "Nothing to report"
and concatenate the individual generatedParagraphs
Attempt: Code picks a random element within stringArray1, 2 and 3.
Example of what the code returns:
---> "Sentence1_c.Sentence2_a.Sentence3_b."
PROBLEM: I need the code to pick an element ONLY if it has first passed a boolean test. This means the final concatenated string (concatString) could be empty, contain just one element, or contain several, depending on how many of the bools were true. Does anyone know how to do this?
import Foundation
var stringArray1 = ["","Sentence1_a.", "Sentence1_b.", "Sentence1_c."]
var stringArray2 = ["","Sentence2_a.", "Sentence2_b.", "Sentence2_c."]
var stringArray3 = ["","Sentence3_a.", "Sentence3_b.", "Sentence3_c."]
let count1 = UInt32(stringArray1.count)-1
let count2 = UInt32(stringArray2.count)-1
let count3 = UInt32(stringArray3.count)-1
var randomNumberOne = Int(arc4random_uniform(count1))+1
var randomNumberTwo = Int(arc4random_uniform(count2))+1
var randomNumberThree = Int(arc4random_uniform(count3))+1
let concatString = stringArray1[randomNumberOne] + stringArray2[randomNumberTwo] + stringArray3[randomNumberThree]
Okay, I didn't pass a Bool, but I show concatenating three random strings from a [String]. I ran this in a playground.
import Foundation
var stringArray = [String]()
// Fill the array with "text0" ... "text99"
for i in 0..<100 {
    stringArray.append("text" + "\(i)")
}
func concat(array: [String]) -> String {
    // Use the parameter's count rather than the global array's
    let count = UInt32(array.count)
    let randomNumberOne = Int(arc4random_uniform(count))
    let randomNumberTwo = Int(arc4random_uniform(count))
    let randomNumberThree = Int(arc4random_uniform(count))
    let concatString = array[randomNumberOne] + array[randomNumberTwo] + array[randomNumberThree]
    return concatString
}
let finalString = concat(array: stringArray)
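Neither snippet above applies the boolean tests. The following is a minimal sketch of that missing step, written in Python only to show the shape of the logic (the question asks for Swift, where the same "test, then pick, then join" structure would apply). The function name build_text, the (condition, candidates) pairing, and the sample values are assumptions made for illustration.

```python
import random

def build_text(rules, default="Nothing to report"):
    """Join one random string per rule whose condition is True.

    `rules` is a list of (condition, candidates) pairs; this pairing is an
    illustrative assumption, not something taken from the question.
    """
    picked = [random.choice(candidates) for condition, candidates in rules if condition]
    # Fall back to the default text when no condition passed
    return "".join(picked) if picked else default

# Hypothetical input values
int1, int2, int3 = 40, 50, 250

array1 = ["Sentence1_a.", "Sentence1_b.", "Sentence1_c."]
array2 = ["Sentence2_a.", "Sentence2_b.", "Sentence2_c."]

text = build_text([
    (int1 <= 50 and int2 == 50, array1),  # passes
    (int3 <= 100, array2),                # fails, so nothing is picked from array2
])
print(text)
```

Because each pick happens only when its condition is true, the result can hold zero, one, or several sentences, and the empty strings at index 0 of the original arrays are no longer needed.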

LDAP substring search not working when the value is an integer less than one thousand

I search for users in Active Directory. My code is below:
List<DirectoryEntry> dirEntries = ActiveDirectoryActions.getListByQuery("(&(objectClass=user)(displayName~=*" + q + "*))");
for (int i = 0; i < dirEntries.Count; i++)
{
SiteSearchResult r = new SiteSearchResult();
r.title = dirEntries[i].Properties["displayName"].Value.ToString();
r.url = "/" + lang + "/directory/user/" + dirEntries[i].Properties["sAMAccountName"].Value.ToString();
r.content = dirEntries[i].Properties["title"].Value.ToString();
result.Add(r);
}
And this is the getListByQuery() function:
public static List<DirectoryEntry> getListByQuery(string q)
{
DirectorySearcher drSearch = new DirectorySearcher(rootEntry);
drSearch.Filter = "(distinguishedName=" + Config.xml().Root.Elements("active_directory").Elements("root_ou").Select(x => x.Value).FirstOrDefault().ToString() + ")";
DirectoryEntry searchRoot = drSearch.FindAll()[0].GetDirectoryEntry();
drSearch.SearchRoot = searchRoot;
drSearch.Filter = q;
List<DirectoryEntry> r = new List<DirectoryEntry>();
SearchResultCollection sr = drSearch.FindAll();
for (int i = 0; i < sr.Count; i++)
{
r.Add(sr[i].GetDirectoryEntry());
}
return r;
}
Everything is OK on my local server, but I get an error on the global server when I search for an integer value. Interestingly, it only happens when the value is less than 1000.
[NullReferenceException: Object reference not set to an instance of an
object.] Myproject.Controllers.SearchController.Index(String
lang, String q) in
D:\dotNET\Myproject\Myproject\Controllers\SearchController.cs:60
Help please.
I think there are few issues you need to check
1) Does it make sense to call DirectorySearcher.FindAll() twice? Instead of using DirectorySearcher(rootEntry), you could try setting the search root directly:
string ldapPath = "LDAP://" + Config.xml().Root.Elements("active_directory").Elements("root_ou").Select(x => x.Value).FirstOrDefault().ToString();
DirectoryEntry de = new DirectoryEntry(ldapPath);
DirectorySearcher deSearch = new DirectorySearcher(de);
deSearch.Filter = ...
2) Approximate search (~=) might not be compatible with a substring search (=*substring*); for me, it does not work. So try changing it to =*1000* (without the ~).
3) FindAll() in both cases could return null, so you should check for null in both cases.
4) [MSDN]
Due to implementation restrictions, the SearchResultCollection class
cannot release all of its unmanaged resources when it is garbage
collected. To prevent a memory leak, you must call the Dispose method
when the SearchResultCollection object is no longer needed.
So you either need to call sr.Dispose() or use a using statement.

Linq to Objects - query objects for any non-numeric data

I am trying to write some logic to determine if all values of a certain property of an object in a collection are numeric and greater than zero. I can easily write this using ForEach but I'd like to do it using Linq to Object. I tried this:
var result = entity.Reports.Any(
x =>
x.QuestionBlock == _question.QuestionBlock
&& (!string.IsNullOrEmpty(x.Data)) && Int32.TryParse(x.Data, out tempVal)
&& Int32.Parse(x.Data) > 0);
It does not work correctly. I also tried this, hoping that TryParse() on Int32 would return false the first time it encounters a string that cannot be parsed into an int. But it appears the out param will contain the first string value that can be parsed into an int.
var result = entity.GranteeReportDataModels.Any(
x =>
x.QuestionBlock == _question.QuestionBlock
&& (!string.IsNullOrEmpty(x.Data)) && Int32.TryParse(x.Data, out tempVal));
Any help is greatly appreciated!
If you want to test whether all values meet a condition, you should use the All extension method on IEnumerable<T>, not Any. I would write it like this:
var result = entity.Reports.All(x =>
{
    // Named "value" so it does not clash with the outer "result" variable
    int value;
    return int.TryParse(x.Data, out value) && value > 0;
});
I don't believe you need to test for a null or empty string, because int.TryParse will return false if you pass in a null or empty string.
var allDataIsNatural = entity.Reports.All(r =>
{
int i;
if (!int.TryParse(r.Data, out i))
{
return false;
}
return i > 0;
});
Any returns as soon as the first row matches, but you clearly say you would like to check them all.
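The difference between "any" and "all" is not specific to LINQ. As a language-neutral illustration, here is a small Python sketch of the same check; the helper names parse_positive_int and all_positive_ints are made up for this example and are not part of any answer above.

```python
def parse_positive_int(text):
    """Return the parsed integer if `text` is a whole number above zero, else None."""
    try:
        value = int(text)
    except (TypeError, ValueError):
        # None, empty strings and non-numeric text all end up here
        return None
    return value if value > 0 else None

def all_positive_ints(values):
    # all() stops at the first failing element, much like LINQ's All
    return all(parse_positive_int(v) is not None for v in values)

good = ["1", "42", "7"]
mixed = ["1", "abc", "7"]

print(all_positive_ints(good))    # True
print(all_positive_ints(mixed))   # False
# any() is satisfied by a single match, which is why it gives the wrong answer here
print(any(parse_positive_int(v) is not None for v in mixed))  # True
```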
You can use this extension which tries to parse a string to int and returns a int?:
public static int? TryGetInt(this string item)
{
int i;
bool success = int.TryParse(item, out i);
return success ? (int?)i : (int?)null;
}
Then this query works:
bool all = entity.Reports.All(x => {
    if(x.QuestionBlock != _question.QuestionBlock)
        return false;
    int? data = x.Data.TryGetInt();
    return data.HasValue && data.Value > 0;
});
or more readable (a little bit less efficient):
bool all = entity.Reports
    .All(x => x.Data.TryGetInt().HasValue && x.Data.TryGetInt() > 0
        && x.QuestionBlock == _question.QuestionBlock);
This approach avoids using a local variable as out parameter which is an undocumented behaviour in Linq-To-Objects and might stop working in future. It's also more readable.

How do I create a URL shortener? [closed]

I want to create a URL shortener service where you can write a long URL into an input field and the service shortens the URL to "http://www.example.org/abcdef".
Instead of "abcdef" there can be any other string with six characters containing a-z, A-Z and 0-9. That makes 56~57 billion possible strings.
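The "56~57 billion" figure can be checked with a couple of lines of arithmetic. The constant names below are chosen just for this sketch.

```python
ALPHABET_SIZE = 26 + 26 + 10  # a-z, A-Z, 0-9
CODE_LENGTH = 6

# Each of the six positions can hold any of the 62 characters
total = ALPHABET_SIZE ** CODE_LENGTH
print(f"{total:,}")  # 56,800,235,584
```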
My approach:
I have a database table with three columns:
id, integer, auto-increment
long, string, the long URL the user entered
short, string, the shortened URL (or just the six characters)
I would then insert the long URL into the table. Then I would select the auto-increment value for "id" and build a hash of it. This hash should then be inserted as "short". But what sort of hash should I build? Hash algorithms like MD5 create too long strings. I don't use these algorithms, I think. A self-built algorithm will work, too.
My idea:
For "http://www.google.de/" I get the auto-increment id 239472. Then I do the following steps:
short = '';
if divisible by 2, add "a"+the result to short
if divisible by 3, add "b"+the result to short
... until I have divisors for a-z and A-Z.
That could be repeated until the number isn't divisible any more. Do you think this is a good approach? Do you have a better idea?
Due to the ongoing interest in this topic, I've published an efficient solution to GitHub, with implementations for JavaScript, PHP, Python and Java. Add your solutions if you like :)
I would continue your "convert number to string" approach. However, you will realize that your proposed algorithm fails if your ID is a prime and greater than 52.
Theoretical background
You need a Bijective Function f. This is necessary so that you can find an inverse function g('abc') = 123 for your f(123) = 'abc' function. This means:
There must be no x1, x2 (with x1 ≠ x2) that will make f(x1) = f(x2),
and for every y you must be able to find an x so that f(x) = y.
How to convert the ID to a shortened URL
Think of an alphabet we want to use. In your case, that's [a-zA-Z0-9]. It contains 62 letters.
Take an auto-generated, unique numerical key (the auto-incremented id of a MySQL table for example).
For this example, I will use $125_{10}$ (125 with a base of 10).
Now you have to convert $125_{10}$ to $X_{62}$ (base 62).
$125_{10} = 2 \times 62^1 + 1 \times 62^0 = [2, 1]$
This requires the use of integer division and modulo. A pseudo-code example:
digits = []
while num > 0
    remainder = modulo(num, 62)
    digits.push(remainder)
    num = divide(num, 62)
digits = digits.reverse
Now map the indices 2 and 1 to your alphabet. This is what your mapping (with an array, for example) could look like:
0 → a
1 → b
...
25 → z
...
52 → 0
61 → 9
With 2 → c and 1 → b, you will receive $\mathrm{cb}_{62}$ as the shortened URL.
http://shor.ty/cb
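The conversion steps above can be sketched in Python as follows. This is one possible rendering of the pseudo-code, not the author's published implementation; the explicit zero case is an addition, since the loop alone would yield an empty string for 0.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

def encode(num, alphabet=ALPHABET):
    """Convert a non-negative integer to a string in base len(alphabet)."""
    if num == 0:
        return alphabet[0]
    base = len(alphabet)
    digits = []
    while num > 0:
        # divmod gives the integer quotient and the remainder in one step
        num, remainder = divmod(num, base)
        digits.append(alphabet[remainder])
    # Remainders come out least significant first, so reverse them
    return "".join(reversed(digits))

print(encode(125))  # cb
```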
How to resolve a shortened URL to the initial ID
The reverse is even easier. You just do a reverse lookup in your alphabet.
$\mathrm{e9a}_{62}$ will be resolved to "4th, 61st, and 0th letter in the alphabet".
$\mathrm{e9a}_{62} = [4, 61, 0] = 4 \times 62^2 + 61 \times 62^1 + 0 \times 62^0 = 19158_{10}$
Now find your database-record with WHERE id = 19158 and do the redirect.
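The reverse lookup can be sketched the same way. Again, this is an illustrative Python version rather than the author's code. It assumes the alphabet ordering from the mapping above, and an unknown character raises ValueError via str.index.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

def decode(code, alphabet=ALPHABET):
    """Convert a base len(alphabet) string back to its integer ID."""
    base = len(alphabet)
    num = 0
    for char in code:
        # Shift the running total one position left, then add this digit
        num = num * base + alphabet.index(char)
    return num

print(decode("e9a"))  # 19158
print(decode("cb"))   # 125
```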
Example implementations (provided by commenters)
C++
Python
Ruby
Haskell
C#
CoffeeScript
Perl
Why would you want to use a hash?
You can just use a simple translation of your auto-increment value to an alphanumeric value. You can do that easily by using some base conversion. Say your character space (A-Z, a-z, 0-9, etc.) has 62 characters: convert the id to a base-62 number and use the characters as the digits.
public class UrlShortener {
private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
private static final int BASE = ALPHABET.length();
public static String encode(int num) {
StringBuilder sb = new StringBuilder();
while ( num > 0 ) {
sb.append( ALPHABET.charAt( num % BASE ) );
num /= BASE;
}
return sb.reverse().toString();
}
public static int decode(String str) {
int num = 0;
for ( int i = 0; i < str.length(); i++ )
num = num * BASE + ALPHABET.indexOf(str.charAt(i));
return num;
}
}
Not an answer to your question, but I wouldn't use case-sensitive shortened URLs. They are hard to remember, usually unreadable (many fonts render 1 and l, 0 and O, and other characters so similarly that it is nearly impossible to tell them apart) and downright error-prone. Try to use lower or upper case only.
Also, try to have a format where you mix the numbers and characters in a predefined form. There are studies that show that people tend to remember one form better than others (think phone numbers, where the numbers are grouped in a specific form). Try something like num-char-char-num-char-char. I know this will lower the combinations, especially if you don't have upper and lower case, but it would be more usable and therefore useful.
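To get a feel for how much the fixed num-char-char-num-char-char format lowers the number of combinations, here is a rough Python sketch that treats each position as its own "digit" (a mixed-radix number). The names PATTERN, capacity and encode are assumptions for this illustration, and the full lowercase alphabet is used for simplicity.

```python
import string

DIGITS = string.digits            # 10 symbols
LETTERS = string.ascii_lowercase  # 26 symbols
# num-char-char-num-char-char
PATTERN = [DIGITS, LETTERS, LETTERS, DIGITS, LETTERS, LETTERS]

def capacity(pattern=PATTERN):
    """Number of distinct codes the pattern can represent."""
    total = 1
    for symbols in pattern:
        total *= len(symbols)
    return total

def encode(num, pattern=PATTERN):
    """Map an integer ID onto the pattern, filling from the rightmost position."""
    if not 0 <= num < capacity(pattern):
        raise ValueError("id does not fit in the pattern")
    out = []
    for symbols in reversed(pattern):
        num, rem = divmod(num, len(symbols))
        out.append(symbols[rem])
    return "".join(reversed(out))

print(capacity())  # 45697600
print(encode(0))   # 0aa0aa
print(encode(1))   # 0aa0ab
```

About 45.7 million codes fit in six characters this way, compared with roughly 56.8 billion for unrestricted base 62.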
My approach: Take the Database ID, then Base36 Encode it. I would NOT use both Upper AND Lowercase letters, because that makes transmitting those URLs over the telephone a nightmare, but you could of course easily extend the function to be a base 62 en/decoder.
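A minimal Python sketch of the Base36 idea, for comparison. The function names are made up for this example. Decoding relies on the built-in int(), which accepts bases up to 36 and ignores letter case, so a code read out over the phone works in either case.

```python
import string

ALPHABET36 = string.digits + string.ascii_lowercase  # 0-9 followed by a-z

def to_base36(num):
    """Encode a non-negative database ID as a lowercase base-36 string."""
    if num == 0:
        return "0"
    digits = []
    while num > 0:
        num, rem = divmod(num, 36)
        digits.append(ALPHABET36[rem])
    return "".join(reversed(digits))

def from_base36(code):
    """Decode a base-36 string; upper and lower case are treated alike."""
    return int(code, 36)

print(to_base36(56789))     # 17th
print(from_base36("17TH"))  # 56789
```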
Here is my PHP 5 class.
<?php
class Bijective
{
public $dictionary = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
public function __construct()
{
$this->dictionary = str_split($this->dictionary);
}
public function encode($i)
{
if ($i == 0)
return $this->dictionary[0];
$result = array();
$base = count($this->dictionary);
while ($i > 0)
{
$result[] = $this->dictionary[($i % $base)];
$i = floor($i / $base);
}
$result = array_reverse($result);
return join("", $result);
}
public function decode($input)
{
$i = 0;
$base = count($this->dictionary);
$input = str_split($input);
foreach($input as $char)
{
$pos = array_search($char, $this->dictionary);
$i = $i * $base + $pos;
}
return $i;
}
}
A Node.js and MongoDB solution
Since we know the format that MongoDB uses to create a new ObjectId with 12 bytes.
a 4-byte value representing the seconds since the Unix epoch,
a 3-byte machine identifier,
a 2-byte process id
a 3-byte counter (in your machine), starting with a random value.
Example (I choose a random sequence)
a1b2c3d4e5f6g7h8i9j1k2l3
a1b2c3d4 represents the seconds since the Unix epoch,
e5f6g7 represents machine identifier,
h8i9 represents process id
j1k2l3 represents the counter, starting with a random value.
Since the counter is unique as long as we store the data on the same machine, we can use it without worrying that it will be duplicated.
So the short URL will be the counter, and here is a code snippet assuming that your server is running properly.
const mongoose = require('mongoose');
const Schema = mongoose.Schema;
// Create a schema
const shortUrl = new Schema({
long_url: { type: String, required: true },
short_url: { type: String, required: true, unique: true },
});
const ShortUrl = mongoose.model('ShortUrl', shortUrl);
// The user can request to get a short URL by providing a long URL using a form
app.post('/shorten', function(req ,res){
// Create a new shortUrl
// The submit form has an input with longURL as its name attribute.
const longUrl = req.body["longURL"];
const newUrl = ShortUrl({
long_url : longUrl,
short_url : "",
});
const shortUrl = newUrl._id.toString().slice(-6);
newUrl.short_url = shortUrl;
console.log(newUrl);
newUrl.save(function(err){
console.log("the new URL is added");
})
});
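To make the slicing concrete, the following Python sketch splits a 24-character hex ObjectId according to the layout described above. The sample ID is only an illustrative value, split_object_id is a made-up helper, and other MongoDB versions may lay out the middle bytes differently, so treat the machine and process fields as an assumption taken from this answer.

```python
import string
from datetime import datetime, timezone

def split_object_id(hex_id):
    """Split a 24-character hex ObjectId into the four fields described above."""
    if len(hex_id) != 24 or any(c not in string.hexdigits for c in hex_id):
        raise ValueError("expected 24 hexadecimal characters")
    return {
        "timestamp": hex_id[0:8],   # 4 bytes: seconds since the Unix epoch
        "machine": hex_id[8:14],    # 3 bytes (per the layout in this answer)
        "process": hex_id[14:18],   # 2 bytes (per the layout in this answer)
        "counter": hex_id[18:24],   # 3 bytes: what slice(-6) extracts
    }

parts = split_object_id("507f1f77bcf86cd799439011")  # sample value for illustration
created = datetime.fromtimestamp(int(parts["timestamp"], 16), tz=timezone.utc)

print(parts["counter"])  # 439011
print(created.year)      # 2012
```

A 3-byte counter has 16,777,216 possible values, so it eventually wraps around; the unique index on short_url in the schema above is what would catch a repeat.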
I keep incrementing an integer sequence per domain in the database and use Hashids to encode the integer into a URL path.
static hashids = Hashids(salt = "my app rocks", minSize = 6)
I ran a script to see how long it takes until it exhausts the character length. For six characters it can do 164,916,224 links and then goes up to seven characters. Bitly uses seven characters. Under five characters looks weird to me.
Hashids can decode the URL path back to an integer, but a simpler solution is to use the entire short link sho.rt/ka8ds3 as a primary key.
Here is the full concept:
function addDomain(domain) {
table("domains").insert("domain", domain, "seq", 0)
}
function addURL(domain, longURL) {
seq = table("domains").where("domain = ?", domain).increment("seq")
shortURL = domain + "/" + hashids.encode(seq)
table("links").insert("short", shortURL, "long", longURL)
return shortURL
}
// GET /:hashcode
function handleRequest(req, res) {
shortURL = req.host + "/" + req.param("hashcode")
longURL = table("links").where("short = ?", shortURL).get("long")
res.redirect(301, longURL)
}
C# version:
public class UrlShortener
{
private static String ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
private static int BASE = 62;
public static String encode(int num)
{
StringBuilder sb = new StringBuilder();
while ( num > 0 )
{
sb.Append( ALPHABET[( num % BASE )] );
num /= BASE;
}
StringBuilder builder = new StringBuilder();
for (int i = sb.Length - 1; i >= 0; i--)
{
builder.Append(sb[i]);
}
return builder.ToString();
}
public static int decode(String str)
{
int num = 0;
for ( int i = 0, len = str.Length; i < len; i++ )
{
num = num * BASE + ALPHABET.IndexOf( str[(i)] );
}
return num;
}
}
You could hash the entire URL, but if you just want to shorten the id, do as marcel suggested. I wrote this Python implementation:
https://gist.github.com/778542
Take a look at https://hashids.org/. It is open source and available in many languages.
Their page outlines some of the pitfalls of other approaches.
If you don't want to re-invent the wheel ... http://lilurl.sourceforge.net/
// simple approach
$original_id = 56789;
$shortened_id = base_convert($original_id, 10, 36);
$un_shortened_id = base_convert($shortened_id, 36, 10);
# a-z, A-Z, then 0-9 (Python 3: range objects must be converted before concatenating)
alphabet = [chr(c) for c in list(range(97, 123)) + list(range(65, 91))] + [str(d) for d in range(10)]
def lookup(k, a=alphabet):
    # An int is mapped to its character, a str to its index
    if isinstance(k, int):
        return a[k]
    elif isinstance(k, str):
        return a.index(k)
def encode(i, a=alphabet):
    '''Takes an integer and returns it in the given base with mappings for upper/lower case letters and numbers 0-9.'''
    try:
        i = int(i)
    except Exception:
        raise TypeError("Input must be an integer.")
    def incode(i=i, a=a):
        # Recurse on the quotient first so the most significant digit comes first;
        # this also keeps zero digits in the middle of the number.
        if i <= 61:
            return lookup(i, a)
        return incode(i // 62, a) + lookup(i % 62, a)
    return incode()
def decode(s, a=alphabet):
    '''Takes a base 62 string in our alphabet and returns it in base10.'''
    try:
        s = str(s)
    except Exception:
        raise TypeError("Input must be a string.")
    return sum(lookup(c, a) * pow(62, p) for p, c in enumerate(reversed(s)))
Here's my version for whomever needs it.
Why not just translate your id to a string? You just need a function that maps a digit between, say, 0 and 61 to a single letter (upper/lower case) or digit. Then apply this to create, say, 4-letter codes, and you've got 14.7 million URLs covered.
Here is a decent URL encoding function for PHP...
// From http://snipplr.com/view/22246/base62-encode--decode/
private function base_encode($val, $base=62, $chars='0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ') {
$str = '';
do {
$i = fmod($val, $base);
$str = $chars[$i] . $str;
$val = ($val - $i) / $base;
} while($val > 0);
return $str;
}
Don't know if anyone will find this useful - it is more of a 'hack n slash' method, yet is simple and works nicely if you want only specific chars.
$dictionary = "abcdfghjklmnpqrstvwxyz23456789";
$dictionary = str_split($dictionary);
// Encode
$str_id = '';
$base = count($dictionary);
while($id > 0) {
$rem = $id % $base;
$id = ($id - $rem) / $base;
$str_id .= $dictionary[$rem];
}
// Decode
$id_ar = str_split($str_id);
$id = 0;
for($i = count($id_ar); $i > 0; $i--) {
$id += array_search($id_ar[$i-1], $dictionary) * pow($base, $i - 1);
}
Did you omit O, 0, and i on purpose?
I just created a PHP class based on Ryan's solution.
<?php
$shorty = new App_Shorty();
echo 'ID: ' . 1000;
echo '<br/> Short link: ' . $shorty->encode(1000);
echo '<br/> Decoded Short Link: ' . $shorty->decode($shorty->encode(1000));
/**
* A nice shortening class based on Ryan Charmley's suggestion; see the link on Stack Overflow below.
* @author Svetoslav Marinov (Slavi) | http://WebWeb.ca
* @see http://stackoverflow.com/questions/742013/how-to-code-a-url-shortener/10386945#10386945
*/
class App_Shorty {
/**
* Explicitly omitted: i, o, 1, 0 because they are confusing. Also use only lowercase ... as
* dictating this over the phone might be tough.
* @var string
*/
private $dictionary = "abcdfghjklmnpqrstvwxyz23456789";
private $dictionary_array = array();
public function __construct() {
$this->dictionary_array = str_split($this->dictionary);
}
/**
* Gets ID and converts it into a string.
* @param int $id
*/
public function encode($id) {
$str_id = '';
$base = count($this->dictionary_array);
while ($id > 0) {
$rem = $id % $base;
$id = ($id - $rem) / $base;
$str_id .= $this->dictionary_array[$rem];
}
return $str_id;
}
/**
* Converts /abc into an integer ID
* @param string
* @return int $id
*/
public function decode($str_id) {
$id = 0;
$id_ar = str_split($str_id);
$base = count($this->dictionary_array);
for ($i = count($id_ar); $i > 0; $i--) {
$id += array_search($id_ar[$i - 1], $this->dictionary_array) * pow($base, $i - 1);
}
return $id;
}
}
?>
public class TinyUrl {
private final String characterMap = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
private final int charBase = characterMap.length();
public String covertToCharacter(int num){
StringBuilder sb = new StringBuilder();
while (num > 0){
sb.append(characterMap.charAt(num % charBase));
num /= charBase;
}
return sb.reverse().toString();
}
public int covertToInteger(String str){
int num = 0;
for(int i = 0 ; i< str.length(); i++)
num += characterMap.indexOf(str.charAt(i)) * Math.pow(charBase , (str.length() - (i + 1)));
return num;
}
}
class TinyUrlTest{
public static void main(String[] args) {
TinyUrl tinyUrl = new TinyUrl();
int num = 122312215;
String url = tinyUrl.covertToCharacter(num);
System.out.println("Tiny url: " + url);
System.out.println("Id: " + tinyUrl.covertToInteger(url));
}
}
This is what I use:
# Generate a [0-9a-zA-Z] string (Python 3: ranges are converted to lists before concatenating)
ALPHABET = list(map(str, range(0, 10))) + list(map(chr, list(range(97, 123)) + list(range(65, 91))))
def encode_id(id_number, alphabet=ALPHABET):
    """Convert an integer to a string."""
    if id_number == 0:
        return alphabet[0]
    alphabet_len = len(alphabet) # Cache
    result = ''
    while id_number > 0:
        id_number, mod = divmod(id_number, alphabet_len)
        result = alphabet[mod] + result
    return result
def decode_id(id_string, alphabet=ALPHABET):
    """Convert a string to an integer."""
    alphabet_len = len(alphabet) # Cache
    return sum([alphabet.index(char) * pow(alphabet_len, power) for power, char in enumerate(reversed(id_string))])
It's very fast and can take long integers.
For a similar project, to get a new key, I make a wrapper function around a random string generator that calls the generator until I get a string that hasn't already been used in my hashtable. This method will slow down once your name space starts to get full, but as you have said, even with only 6 characters, you have plenty of namespace to work with.
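A small Python sketch of this "generate until unused" wrapper. The set used_keys stands in for the hash table or database lookup, and the function name, the max_attempts cap, and the choice of the secrets module are assumptions for this example rather than details from the answer.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # 62 symbols

def new_key(used, length=6, max_attempts=100):
    """Return a random key that is not yet in `used`, and record it there."""
    for _ in range(max_attempts):
        key = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if key not in used:
            used.add(key)
            return key
    # Give up instead of looping forever once the key space is nearly full
    raise RuntimeError("could not find an unused key")

used_keys = set()  # stands in for the real storage lookup
first = new_key(used_keys)
second = new_key(used_keys)
print(first, second)
```

The cap on attempts makes the slowdown the answer mentions visible as an error rather than an ever longer loop.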
I have a variant of the problem, in that I store web pages from many different authors and need to prevent discovery of pages by guesswork. So my short URLs add a couple of extra digits to the Base-62 string for the page number. These extra digits are generated from information in the page record itself and they ensure that only 1 in 3844 URLs are valid (assuming 2-digit Base-62). You can see an outline description at http://mgscan.com/MBWL.
Very good answer, I have created a Golang implementation of the bjf:
package bjf
import (
"math"
"strings"
"strconv"
)
const alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
func Encode(num string) string {
n, _ := strconv.ParseUint(num, 10, 64)
t := make([]byte, 0)
/* Special case */
if n == 0 {
return string(alphabet[0])
}
/* Map */
for n > 0 {
r := n % uint64(len(alphabet))
t = append(t, alphabet[r])
n = n / uint64(len(alphabet))
}
/* Reverse */
for i, j := 0, len(t) - 1; i < j; i, j = i + 1, j - 1 {
t[i], t[j] = t[j], t[i]
}
return string(t)
}
func Decode(token string) int {
r := int(0)
p := float64(len(token)) - 1
for i := 0; i < len(token); i++ {
r += strings.Index(alphabet, string(token[i])) * int(math.Pow(float64(len(alphabet)), p))
p--
}
return r
}
Hosted at github: https://github.com/xor-gate/go-bjf
Implementation in Scala:
class Encoder(alphabet: String) extends (Long => String) {
val Base = alphabet.size
override def apply(number: Long) = {
def encode(current: Long): List[Int] = {
if (current == 0) Nil
else (current % Base).toInt :: encode(current / Base)
}
encode(number).reverse
.map(current => alphabet.charAt(current)).mkString
}
}
class Decoder(alphabet: String) extends (String => Long) {
val Base = alphabet.size
override def apply(string: String) = {
def decode(current: Long, encodedPart: String): Long = {
if (encodedPart.size == 0) current
else decode(current * Base + alphabet.indexOf(encodedPart.head),encodedPart.tail)
}
decode(0,string)
}
}
Test example with Scala test:
import org.scalatest.{FlatSpec, Matchers}
class DecoderAndEncoderTest extends FlatSpec with Matchers {
val Alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
"A number with base 10" should "be correctly encoded into base 62 string" in {
val encoder = new Encoder(Alphabet)
encoder(127) should be ("cd")
encoder(543513414) should be ("KWGPy")
}
"A base 62 string" should "be correctly decoded into a number with base 10" in {
val decoder = new Decoder(Alphabet)
decoder("cd") should be (127)
decoder("KWGPy") should be (543513414)
}
}
Function based on Xeoncross's class
function shortly($input){
$dictionary = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z','0','1','2','3','4','5','6','7','8','9'];
if($input===0)
return $dictionary[0];
$base = count($dictionary);
if(is_numeric($input)){
$result = [];
while($input > 0){
$result[] = $dictionary[($input % $base)];
$input = floor($input / $base);
}
return join("", array_reverse($result));
}
$i = 0;
$input = str_split($input);
foreach($input as $char){
$pos = array_search($char, $dictionary);
$i = $i * $base + $pos;
}
return $i;
}
Here is a Node.js implementation, similar to bit.ly, that generates a highly random seven-character string.
It uses the Node.js crypto module to generate 25 random bytes (a 50-character hex string) and then picks seven characters from that string, rather than picking seven characters directly.
var crypto = require("crypto");
exports.shortURL = new function () {
this.getShortURL = function () {
var sURL = '',
_rand = crypto.randomBytes(25).toString('hex'),
_base = _rand.length;
for (var i = 0; i < 7; i++)
sURL += _rand.charAt(Math.floor(Math.random() * _rand.length));
return sURL;
};
}
My Python 3 version
base_list = list("0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")
base = len(base_list)
def encode(num: int):
    result = []
    if num == 0:
        result.append(base_list[0])
    while num > 0:
        result.append(base_list[num % base])
        num //= base
    print("".join(reversed(result)))
def decode(code: str):
    num = 0
    code_list = list(code)
    for index, code in enumerate(reversed(code_list)):
        num += base_list.index(code) * base ** index
    print(num)
if __name__ == '__main__':
    encode(341413134141)
    decode("60FoItT")
For a quality Node.js / JavaScript solution, see the id-shortener module, which is thoroughly tested and has been used in production for months.
It provides an efficient id / URL shortener backed by pluggable storage defaulting to Redis, and you can even customize your short id character set and whether or not shortening is idempotent. This is an important distinction that not all URL shorteners take into account.
In relation to other answers here, this module implements Marcel Jackwerth's excellent accepted answer above.
The core of the solution is provided by the following Redis Lua snippet:
local sequence = redis.call('incr', KEYS[1])
local chars = '0123456789ABCDEFGHJKLMNPQRSTUVWXYZ_abcdefghijkmnopqrstuvwxyz'
local remaining = sequence
local slug = ''
while (remaining > 0) do
local d = (remaining % 60)
local character = string.sub(chars, d + 1, d + 1)
slug = character .. slug
remaining = (remaining - d) / 60
end
redis.call('hset', KEYS[2], slug, ARGV[1])
return slug
Why not just generate a random string and append it to the base URL? This is a very simplified version of doing this in C#.
static string chars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890";
static string baseUrl = "https://google.com/";
private static string RandomString(int length)
{
char[] s = new char[length];
Random rnd = new Random();
for (int x = 0; x < length; x++)
{
s[x] = chars[rnd.Next(chars.Length)];
}
Thread.Sleep(10);
return new String(s);
}
Then just append the random string to the baseUrl:
string tinyURL = baseUrl + RandomString(5);
Remember, this is a very simplified version, and it's possible for the RandomString method to create duplicate strings. In production you would want to account for duplicates to ensure you always have a unique URL. I have some code that handles duplicate strings by querying a database table, which I could share if anyone is interested.
This is my initial thoughts, and more thinking can be done, or some simulation can be made to see if it works well or any improvement is needed:
My answer is to remember the long URL in the database, and use the ID 0 to 9999999999999999 (or however large the number is needed).
But the ID 0 to 9999999999999999 can be an issue, because
it can be shorter if we use hexadecimal, or even base62 or base64. (base64 just like YouTube using A-Z a-z 0-9 _ and -)
if it increases from 0 to 9999999999999999 uniformly, then hackers can visit them in that order and know what URLs people are sending each other, so it can be a privacy issue
We can do this:
have one allocating server hand out IDs 0 to 999 to one requesting server, Server A, so Server A now holds 1000 IDs. If there are 20 or 200 servers constantly wanting new IDs, they don't have to ask for each new ID, but can ask once for 1000 IDs
for the ID 1, for example, reverse the bits. So 000...00000001 becomes 10000...000, so that when converted to base64, the IDs no longer increase uniformly.
use XOR to flip the bits for the final IDs. For example, XOR with 0xD5AA96...2373 (like a secret key), and some of the bits will be flipped (wherever the secret key has a 1 bit, it flips that bit of the ID). This will make the IDs even harder to guess and appear more random
Following this scheme, the single server that allocates the IDs can form the IDs, and so can the 20 or 200 servers requesting the allocation of IDs. The allocating server has to use a lock / semaphore to prevent two requesting servers from getting the same batch (or if it is accepting one connection at a time, this already solves the problem). So we don't want the line (queue) to be too long for waiting to get an allocation. So that's why allocating 1000 or 10000 at a time can solve the issue.
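The scheme above could be sketched roughly as follows in Python. The 64-bit ID width, the key value, the base62 alphabet order, and all function and class names are assumptions made for illustration only. Note that reversing bits and XOR-ing with a constant only obscures the sequence; it is not a substitute for real encryption.

```python
import string
import threading

ID_BITS = 64                     # assumed width of an ID
SECRET_KEY = 0x123456789ABCDEF0  # made-up example key
BASE62 = string.digits + string.ascii_uppercase + string.ascii_lowercase


class BlockAllocator:
    """Hands out consecutive blocks of IDs, one block per request."""

    def __init__(self, block_size=1000):
        self._next = 0
        self._block_size = block_size
        self._lock = threading.Lock()  # stops two callers getting the same block

    def allocate(self):
        with self._lock:
            start = self._next
            self._next += self._block_size
        return range(start, start + self._block_size)


def reverse_bits(value, width=ID_BITS):
    """Reverse the order of the lowest `width` bits of `value`."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def obfuscate(sequential_id):
    """Reverse the bits, then flip the bits selected by the key."""
    return reverse_bits(sequential_id) ^ SECRET_KEY


def deobfuscate(public_id):
    """Undo obfuscate(): XOR with the key again, then reverse the bits back."""
    return reverse_bits(public_id ^ SECRET_KEY)


def to_base62(number):
    """Encode a non-negative integer using the BASE62 alphabet."""
    if number == 0:
        return BASE62[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(BASE62[remainder])
    return "".join(reversed(digits))


allocator = BlockAllocator()
block = allocator.allocate()  # a requesting server gets IDs 0..999 in one call
short_codes = [to_base62(obfuscate(i)) for i in block]
```

Because both steps are reversible, a server can map a short code back to the sequential ID (after decoding base62) and look up the long URL by that ID.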

Join a string using delimiters

What is the best way to join a list of strings into a combined delimited string? I'm mainly concerned about when to stop adding the delimiter. I'll use C# for my examples but I would like this to be language agnostic.
EDIT: I have not used StringBuilder to make the code slightly simpler.
Use a For Loop
for(int i=0; i < list.Length; i++)
{
result += list[i];
if(i != list.Length - 1)
result += delimiter;
}
Use a For Loop setting the first item previously
result = list[0];
for(int i = 1; i < list.Length; i++)
result += delimiter + list[i];
These won't work for an IEnumerable where you don't know the length of the list beforehand so
Using a foreach loop
bool first = true;
foreach(string item in list)
{
if(!first)
result += delimiter;
result += item;
first = false;
}
Variation on a foreach loop
From Jon's solution
StringBuilder builder = new StringBuilder();
string delimiter = "";
foreach (string item in list)
{
builder.Append(delimiter);
builder.Append(item);
delimiter = ",";
}
return builder.ToString();
Using an Iterator
Again from Jon
using (IEnumerator<string> iterator = list.GetEnumerator())
{
if (!iterator.MoveNext())
return "";
StringBuilder builder = new StringBuilder(iterator.Current);
while (iterator.MoveNext())
{
builder.Append(delimiter);
builder.Append(iterator.Current);
}
return builder.ToString();
}
What other algorithms are there?
It's impossible to give a truly language-agnostic answer here as different languages and platforms handle strings differently, and provide different levels of built-in support for joining lists of strings. You could take pretty much identical code in two different languages, and it would be great in one and awful in another.
In C#, you could use:
StringBuilder builder = new StringBuilder();
string delimiter = "";
foreach (string item in list)
{
builder.Append(delimiter);
builder.Append(item);
delimiter = ",";
}
return builder.ToString();
This will prepend a comma on all but the first item. Similar code would be good in Java too.
EDIT: Here's an alternative, a bit like Ian's later answer but working on a general IEnumerable<string>.
// Change to IEnumerator for the non-generic IEnumerable
using (IEnumerator<string> iterator = list.GetEnumerator())
{
if (!iterator.MoveNext())
{
return "";
}
StringBuilder builder = new StringBuilder(iterator.Current);
while (iterator.MoveNext())
{
builder.Append(delimiter);
builder.Append(iterator.Current);
}
return builder.ToString();
}
EDIT nearly 5 years after the original answer...
In .NET 4, string.Join was overloaded pretty significantly. There's an overload taking IEnumerable<T> which automatically calls ToString, and there's an overload for IEnumerable<string>. So you don't need the code above any more... for .NET, anyway.
In .NET, you can use the String.Join method:
string concatenated = String.Join(",", list.ToArray());
Using .NET Reflector, we can find out how it does it:
public static unsafe string Join(string separator, string[] value, int startIndex, int count)
{
if (separator == null)
{
separator = Empty;
}
if (value == null)
{
throw new ArgumentNullException("value");
}
if (startIndex < 0)
{
throw new ArgumentOutOfRangeException("startIndex", Environment.GetResourceString("ArgumentOutOfRange_StartIndex"));
}
if (count < 0)
{
throw new ArgumentOutOfRangeException("count", Environment.GetResourceString("ArgumentOutOfRange_NegativeCount"));
}
if (startIndex > (value.Length - count))
{
throw new ArgumentOutOfRangeException("startIndex", Environment.GetResourceString("ArgumentOutOfRange_IndexCountBuffer"));
}
if (count == 0)
{
return Empty;
}
int length = 0;
int num2 = (startIndex + count) - 1;
for (int i = startIndex; i <= num2; i++)
{
if (value[i] != null)
{
length += value[i].Length;
}
}
length += (count - 1) * separator.Length;
if ((length < 0) || ((length + 1) < 0))
{
throw new OutOfMemoryException();
}
if (length == 0)
{
return Empty;
}
string str = FastAllocateString(length);
fixed (char* chRef = &str.m_firstChar)
{
UnSafeCharBuffer buffer = new UnSafeCharBuffer(chRef, length);
buffer.AppendString(value[startIndex]);
for (int j = startIndex + 1; j <= num2; j++)
{
buffer.AppendString(separator);
buffer.AppendString(value[j]);
}
}
return str;
}
There's little reason to make it language-agnostic when some languages provide support for this in one line, e.g., Python's
",".join(sequence)
See the join documentation for more info.
For python be sure you have a list of strings, else ','.join(x) will fail.
For a safe method using 2.5+
delimiter = '","'
delimiter.join(str(a) if a else '' for a in list_object)
The "str(a) if a else ''" is good for None types; otherwise str() ends up making them 'None', which isn't nice ;)
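One thing to watch with the expression above: "if a" is also false for 0, False and empty strings, so those get blanked as well. A small variant that only treats None specially might look like this (join_values is a hypothetical helper name, and the plain comma delimiter is just for the example):

```python
def join_values(values, delimiter=","):
    """Join arbitrary values into one string; None becomes an empty field."""
    return delimiter.join("" if v is None else str(v) for v in values)


print(join_values([1, None, "x", 0]))
```

Here 0 is kept as "0" and only the None entry turns into an empty field.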
In PHP's implode():
$string = implode($delim, $array);
I'd always add the delimiter and then remove it at the end if necessary. This way, you're not executing an if statement on every iteration of the loop when you only need to do the work once.
StringBuilder sb = new StringBuilder();
foreach(string item in list){
    sb.Append(item);
    sb.Append(delimiter);
}
// Strip the trailing delimiter, if anything was appended.
if (list.Count > 0) {
    sb.Remove(sb.Length - delimiter.Length, delimiter.Length);
}
I would express this recursively.
Check if the number of string arguments is 1. If it is, return it.
Otherwise recurse, but combine the first two arguments with the delimiter between them.
Example in Common Lisp:
(defun join (delimiter &rest strings)
(if (null (rest strings))
(first strings)
(apply #'join
delimiter
(concatenate 'string
(first strings)
delimiter
(second strings))
(cddr strings))))
The more idiomatic way is to use reduce, but this expands to almost exactly the same instructions as the above:
(defun join (delimiter &rest strings)
(reduce (lambda (a b)
(concatenate 'string a delimiter b))
strings))
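For comparison, the same reduce idea can be sketched in Python with functools.reduce. join_reduce is a hypothetical name, and the empty-input guard is an addition of this sketch, since reduce without an initial value raises TypeError on an empty sequence.

```python
from functools import reduce


def join_reduce(delimiter, strings):
    """Fold the strings left to right, inserting the delimiter between pairs."""
    strings = list(strings)
    if not strings:
        return ""
    return reduce(lambda a, b: a + delimiter + b, strings)


print(join_reduce(", ", ["a", "b", "c"]))
```

Each step builds a new string, so this is mainly of interest as an illustration of the algorithm; for long inputs the built-in join is the better choice.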
List<string> aaa = new List<string>{ "aaa", "bbb", "ccc" };
string mm = ";";
return aaa.Aggregate((a, b) => a + mm + b);
and you get
aaa;bbb;ccc
lambda is pretty handy
In C# you can just use String.Join(separator,string_list)
The problem is that computer languages rarely have string booleans, that is, methods that are of type string that do anything useful. SQL Server at least has is[not]null and nullif, which when combined solve the delimiter problem, by the way: isnotnull(nullif(columnvalue, ""),"," + columnvalue))
The problem is that in languages there are booleans, and there are strings, and never the twain shall meet except in ugly coding forms, e.g.
concatstring = string1 + "," + string2;
if (fubar)
concatstring += string3
concatstring += string4 etc
I've tried mightily to avoid all this ugliness, playing comma games and concatenating with joins, but I'm still left with some of it, including SQL Server errors when I've missed one of the commas and a variable is empty.
Jonathan
Since you tagged this language agnostic,
This is how you would do it in python
# delimiter can be multichar like "| trlalala |"
delimiter = ";"
# sequence can be any list, or iterator/generator that returns list of strings
result = delimiter.join(sequence)
#result will NOT have ending delimiter
Edit: I see I got beat to the answer by several people. Sorry for the duplication.
I think the best way to do something like that is (I'll use pseudo-code, so we'll make it truly language agnostic):
function concat(<array> list, <boolean> strict):
    result = "";
    first = true;
    for i in list:
        if the length of i is zero and strict is false:
            continue;
        if first is false:
            result = result + separator;
        result = result + i;
        first = false;
    return result;
the second argument to concat(), strict, is a flag to know if eventual empty strings have to be considered in concatenation or not.
I never append a final separator. If strict is false, empty strings are skipped, so with a comma as the separator the result would be "A,B,F" rather than something like "A,B,,,F".
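A runnable take on the pseudo-code above, as a minimal Python sketch. The separator is passed as a parameter here, which is an assumption; the pseudo-code leaves open where it comes from.

```python
def concat(items, separator=",", strict=False):
    """Join strings with a separator; skip empty strings unless strict is True."""
    parts = []
    for item in items:
        if not strict and len(item) == 0:
            continue  # drop empty entries in non-strict mode
        parts.append(item)
    return separator.join(parts)


print(concat(["A", "B", "", "", "F"]))               # empty strings skipped
print(concat(["A", "B", "", "", "F"], strict=True))  # empty strings kept
```

Filtering first and joining afterwards sidesteps the question of which element counts as "the first" once some entries have been skipped.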
that's how python solves the problem:
','.join(list_of_strings)
I never could understand the need for 'algorithms' in trivial cases though
This is a working solution in C#. In Java, you can use a similar for-each loop over an iterator.
string result = string.Empty;
// use stringbuilder at some stage.
foreach (string item in list)
result += "," + item ;
result = result.Substring(1);
// output: "item,item,item"
If using .NET, you might want to use extension method so that you can do
list.ToString(",")
For details, check out Separator Delimited ToString for Array, List, Dictionary, Generic IEnumerable
// contains extension methods, it must be a static class.
public static class ExtensionMethod
{
// apply this extension to any generic IEnumerable object.
public static string ToString<T>(this IEnumerable<T> source,
string separator)
{
if (source == null)
throw new ArgumentException("source can not be null.");
if (string.IsNullOrEmpty(separator))
throw new ArgumentException("separator can not be null or empty.");
// A LINQ query to call ToString on each elements
// and constructs a string array.
string[] array =
(from s in source
select s.ToString()
).ToArray();
// utilise builtin string.Join to concate elements with
// customizable separator.
return string.Join(separator, array);
}
}
EDIT: For performance reasons, replace the concatenation code with the StringBuilder solution mentioned elsewhere in this thread.
Seen the Python answer like 3 times, but no Ruby?!?!?
The first part of the code below creates a new array. Then you just call .join() on it, passing the delimiter, and it returns a string with the delimiter between the items. I believe join calls .to_s on each item before concatenating.
["ID", "Description", "Active"].join(",")
>> "ID,Description,Active"
This can be very useful when combining meta-programming with database interaction.
Does anyone know if C# has something similar to this syntactic sugar?
In Java 8 we can use:
List<String> list = Arrays.asList(new String[] { "a", "b", "c" });
System.out.println(String.join(",", list)); //Output: a,b,c
To have a prefix and suffix we can do
StringJoiner joiner = new StringJoiner(",", "{", "}");
list.forEach(x -> joiner.add(x));
System.out.println(joiner.toString()); //Output: {a,b,c}
Prior to Java 8, you can do something like Jon's answer:
StringBuilder sb = new StringBuilder(prefix);
boolean and = false;
for (E e : iterable) {
if (and) {
sb.append(delimiter);
}
sb.append(e);
and = true;
}
sb.append(suffix);
In .NET, I would use the String.Join method if possible, which lets you specify a separator and a string array. A list can be converted to an array with ToArray, but I don't know what the performance hit of that would be.
The three algorithms that you mention are what I would use (I like the second because it does not have an if statement in it, but if the length is not known I would use the third because it does not duplicate the code). The second will only work if the list is not empty, so that might take another if statement.
A fourth variant might be to put a separator in front of every element that is concatenated and then remove the first separator from the result.
If you do concatenate strings in a loop, note that for non trivial cases the use of a stringbuilder will vastly outperform repeated string concatenations.
You could write your own method AppendTostring(string, delimiter) that appends the delimiter if and only if the string is not empty. Then you just call that method in any loop without having to worry when to append and when not to append.
Edit: better yet of course to use some kind of StringBuffer in the method if available.
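A rough sketch of such a helper in Python (append_with_delimiter is a made-up name; the helper suggested above is called AppendTostring):

```python
def append_with_delimiter(current, item, delimiter):
    """Append item to current, adding the delimiter only if current is non-empty."""
    if current:
        return current + delimiter + item
    return item


result = ""
for word in ["a", "b", "c"]:
    result = append_with_delimiter(result, word, ";")
print(result)
```

One caveat of deciding by "is the string empty": leading empty items leave no trace in the result, which may or may not be what you want.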
string result = "";
foreach(string item in list)
{
result += delimiter + item;
}
// Drop the leading delimiter (assumes the list is not empty).
result = result.Substring(delimiter.Length);
Edit: Of course, you wouldn't use this or any one of your algorithms to concatenate strings. With C#/.NET, you'd probably use a StringBuilder:
StringBuilder sb = new StringBuilder();
foreach(string item in list)
{
sb.Append(delimiter);
sb.Append(item);
}
string result = sb.ToString(delimiter.Length, sb.Length - delimiter.Length);
And a variation of this solution:
StringBuilder sb = new StringBuilder(list[0]);
for (int i=1; i<list.Count; i++)
{
sb.Append(delimiter);
sb.Append(list[i]);
}
string result = sb.ToString();
Neither solution includes any error checks.
From http://dogsblog.softwarehouse.co.zw/post/2009/02/11/IEnumerable-to-Comma-Separated-List-(and-more).aspx
A pet hate of mine when developing is making a list of comma separated ids, it is SO simple but always has ugly code.... Common solutions are to loop through and put a comma after each item then remove the last character, or to have an if statement to check if you are at the beginning or end of the list. Below is a solution you can use on any IEnumerable, i.e. a List, Array etc. It is also the most efficient way I can think of doing it, as it relies on assignment, which is better than editing a string or using an if.
public static class StringExtensions
{
public static string Splice<T>(IEnumerable<T> args, string delimiter)
{
StringBuilder sb = new StringBuilder();
string d = "";
foreach (T t in args)
{
sb.Append(d);
sb.Append(t.ToString());
d = delimiter;
}
return sb.ToString();
}
}
Now it can be used with any IEnumerable eg.
StringExtensions.Splice(billingTransactions.Select(t => t.id), ",")
to give us 31,32,35
For java a very complete answer has been given in this question or this question.
That is, use StringUtils.join in Apache Commons:
String result = StringUtils.join(list, ", ");
In Clojure, you could just use clojure.contrib.str-utils/str-join:
(str-join ", " list)
But for the actual algorithm:
(reduce (fn [res cur] (str res ", " cur)) list)
Groovy also has a String Object.join(String) method.
Java (from Jon's solution):
StringBuilder sb = new StringBuilder();
String delimiter = "";
for (String item : items) {
sb.append(delimiter).append(item);
delimiter = ", ";
}
return sb.toString();
Here is my humble try:
public static string JoinWithDelimiter(List<string> words, string delimiter){
    string joinedString = "";
    if (words.Count > 0)
    {
        // Start with the first word, then prefix each later word with the delimiter.
        joinedString = words[0];
        for (var i = 1; i < words.Count; i++){
            joinedString += delimiter + words[i];
        }
    }
    return joinedString;
}
Usage;
List<string> words = new List<string>(){"my", "name", "is", "Hari"};
Console.WriteLine(JoinWithDelimiter(words, " "));

Resources