Is it possible to make these regex shorter? [closed] - ruby

Closed 10 years ago.
As the title suggests, is it possible to make these regexes shorter? I am using Ruby 1.9.3.
/\n\s+(\w{0,3})[\s&&[^\n]\S]+?([\d\.]+)[\S\s&&[^\n]]+?([\d\.]+)/
and this
/\s+(\w+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+-*\s+(\d+)\s+(\d+)\s+/
Thanks!

/\n\s+(\w{0,3})[\s&&[^\n]\S]+?([\d\.]+)[\S\s&&[^\n]]+?([\d\.]+)/
If I understand Ruby regular expressions correctly, [\s&&[^\n]\S] means that a character should be a whitespace character AND either a non-whitespace character or not a newline. As a character cannot be both a whitespace and a non-whitespace character, you could probably shorten it to [\s&&[^\n]].
You could also remove the parentheses, (\w{0,3}) becomes \w{0,3}, but if you are trying to use the characters in those groups later on in your code, then you shouldn't.
/\s+(\w+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+-*\s+(\d+)\s+(\d+)\s+/
You could combine some of your statements, \s+\w+(\s+\d+){5}\s+-*(\s+\d+){2}\s+, but again this would cause headaches if your code actually uses those groups to extract information.
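To make the "headaches" concrete, here is a small sketch comparing the captures of the original pattern and the shortened one. The sample row is made up for illustration, since the real input is not shown in the question. A repeated group such as (\s+\d+){5} only keeps what its last repetition matched.

```ruby
# Hypothetical sample row; the real input format is not shown in the question.
line = " Mon 1 2 3 4 5 --- 6 7 "

full  = /\s+(\w+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+-*\s+(\d+)\s+(\d+)\s+/
short = /\s+\w+(\s+\d+){5}\s+-*(\s+\d+){2}\s+/

full_captures  = full.match(line).captures
short_captures = short.match(line).captures

p full_captures   # one capture per column
p short_captures  # only the last repetition of each repeated group survives
```

So the shorter pattern still validates the line, but it cannot be used to pull out every column.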

Are you essentially aiming to split a fixed-width-column web page?
Regexp is one way. You may be interested in a fixed-width-column approach:
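require 'open-uri' # adds #read to HTTP URIs

uri = URI.parse 'http://www.ida.liu.se/~TDP007/material/seminarie2/weather.txt'
page = uri.read
rows = page.split(/\n/)[9..-3] # drop the first 9 lines and the last 2
rows.each{|r|
  # slice each row by column position instead of matching a pattern
  day, max, mnt = r[0..3].strip, r[4..11].strip, r[12..17].strip
}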
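The same slicing idea can be tried without a network connection. This is only a sketch: the rows below are invented and built with format so that they line up with the 4-, 8- and 6-character columns assumed by the slices above.

```ruby
# Hypothetical rows laid out as 4-, 8- and 6-character columns (day, max, min).
rows = [[1, 88, 59], [2, 79, 63]].map { |cols| format('%4s%8s%6s', *cols) }

parsed = rows.map do |r|
  # Same column ranges as in the snippet above.
  [r[0..3], r[4..11], r[12..17]].map(&:strip)
end

p parsed
```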

The following might not be shorter (if you count the number of characters needed to type it), but it is a lot more readable:
arr = ['(\w+)'] # Match a word
arr += ['(\d+)']*5 # Match five numbers
arr += ['-*'] # ignore dashes
arr += ['(\d+)']*2 # Match two numbers
# All of the above separated with space, plus space before and after.
my_regexp = Regexp.new(([''] + arr + ['']).join('\s+'))
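As a quick sanity check (a sketch with a made-up sample row), the pattern built this way has the same source as the hand-written one from the question, and its captures can be destructured directly:

```ruby
arr = ['(\w+)'] + ['(\d+)'] * 5 + ['-*'] + ['(\d+)'] * 2
built = Regexp.new(([''] + arr + ['']).join('\s+'))

literal = /\s+(\w+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+-*\s+(\d+)\s+(\d+)\s+/
same_source = built.source == literal.source

# Hypothetical sample row, not taken from the real data.
name, *numbers = built.match(" Mon 1 2 3 4 5 --- 6 7 ").captures

p same_source
p name
p numbers
```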

If that is the only file that you need to process then you can remove unnecessary data by hand, then read the file line by line, split by space characters \s+ and pick out the columns.
Even without removing unnecessary data by hand, you can also read the original file line by line, split by \s+, and test whether the first few entries are numbers. This is exactly what you are doing with regex also (test format and extract data matching the format).
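A minimal sketch of the split-and-test idea, assuming a made-up file layout in which data rows start with three whole numbers (the header, separator and summary lines are invented for illustration):

```ruby
# Hypothetical input: header lines, two data rows, and a summary row.
text = [
  "Dy MxT MnT",
  "-- --- ---",
  " 1  88  59",
  " 2  79  63",
  "mo  82.9 60.5"
].join("\n")

rows = text.each_line.map(&:split).select do |cols|
  first = cols.first(3)
  # Keep the line only if its first three fields are whole numbers.
  first.size == 3 && first.all? { |c| c =~ /\A\d+\z/ }
end

p rows
```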
Note that [\s&&[^\n]\S] means intersecting \s and [^\n]\S, which results in the set: all space characters but new line. So we can rewrite it to [\s&&[^\n]]. However, [\S\s&&[^\n]] means intersecting \S\s and [^\n], which results in the set: all characters but new line. The equivalent rewrite is . or [^\n], but I doubt this is what you mean. The result will still be correct for the current input due to the lazy quantifier, but it might not for bad input.
Another thing is . will mean literal . inside character class, so [\d.] is equivalent to [\d\.].
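These set claims can be checked directly. The sketch below only looks at the 128 ASCII characters, which is an assumption made to keep it small; it collects the characters each class matches and compares the results.

```ruby
ascii = (0..127).map(&:chr)

ws_original   = ascii.grep(/[\s&&[^\n]\S]/)
ws_simplified = ascii.grep(/[\s&&[^\n]]/)
any_original  = ascii.grep(/[\S\s&&[^\n]]/)

# Both whitespace classes match the same characters: whitespace minus newline.
same_whitespace = ws_original == ws_simplified
# The second class matches every character except newline.
all_but_newline = any_original == ascii.reject { |c| c == "\n" }
# An escaped and an unescaped dot behave the same inside a character class.
same_dot = ascii.grep(/[\d.]/) == ascii.grep(/[\d\.]/)

p same_whitespace, all_but_newline, same_dot
```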

Related

Ruby: extract substring between 2nd and 3rd fullstops [closed]

Closed 7 years ago.
I am constructing a program in Ruby which requires the value to be extracted between the 2nd and 3rd full-stop in a string.
I have searched online for various related solutions, including truncation and this prior Stack-Overflow question: Get value between 2nd and 3rd comma, however no answer illustrated a solution in the Ruby language.
Thanks in Advance.
list = my_string.split(".")
list[2]
That will do it, I think. The first command splits the string into a list; the second gets the bit you want.
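If splitting the whole string feels wasteful, String#split accepts a limit as its second argument. A small sketch with a made-up input string:

```ruby
my_string = "one.two.three.four.five" # hypothetical input

# A limit of 4 stops splitting after the third full stop;
# the rest of the string stays in the last element.
parts = my_string.split(".", 4)
third = parts[2]

p parts
p third
```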
You could split the string on full stops (aka periods), but that creates an array with one element for each substring preceding a full stop. If the document had, say, one million such substrings, that would be a rather inefficient way of getting just the third one.
Suppose the string is:
mystring =<<_
Now is the time
for all Rubiests
to come to the
aid of their
bowling team.
Or their frisbee
team. Or their
air guitar team.
Or maybe something
else...
_
Here are a couple of approaches you could take.
#1 Use a regular expression
r = /
(?: # start a non-capture group
.*?\. # match any character any number of times, lazily, followed by a full stop
){2} # end non-capture group and perform operation twice
\K # forget everything matched before
[^.]* # match everything up to the next full stop
/xm # extended/free-spacing regex definition mode and multiline mode
mystring[r]
#=> " Or their\nair guitar team"
You could of course write the regex:
r = /(?:.*?\.){2}\K[^.]*/m
but the extended form makes it self-documenting.
The regex engine will step through the string until it finds a match or concludes that there can be no match, and stop there.
#2 Pretend a full stop is a newline
First suppose we were looking for the third line, rather than the third substring followed by a full stop. We could write:
mystring.each_line.take(3).last.chomp
# => "to come to the"
String#each_line determines where a line ends by examining the input record separator, which is held by the global variable $/. By default, $/ equals a newline. We therefore could do this:
irs = $/ # save old value, normally \n
$/ = '.'
mystring.each_line.take(3).last[0..-2]
#=> " Or their\nair guitar team"
Then leave no footprints:
$/ = irs
Here String#each_line returns an enumerator (in effect, a rule for determining a sequence of values), not an array.
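As a variation on approach #2, String#each_line also accepts the separator as an argument, so the global $/ does not have to be touched at all. A sketch using the same text as above, written as a single string literal:

```ruby
mystring = "Now is the time\nfor all Rubiests\nto come to the\naid of their\n" \
           "bowling team.\nOr their frisbee\nteam. Or their\nair guitar team.\n" \
           "Or maybe something\nelse...\n"

# Treat "." as the line separator for this call only; take(3) stops
# enumerating after the third piece.
third = mystring.each_line('.').take(3).last.chomp('.')

p third
```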

Slow Ruby Regex Becomes Fast with Odd Change

I've been debugging a site to find the source of long page loading times, and I've narrowed it down to a regex that's used to extract URLs from text:
/(?:([\w+.-]+):\/\/|(?:www\.))[^\s<]+/g
This takes about 3 seconds to run on a large block of text. I found out that if I add the inverse of the first clause to the start of the regex ((?:[^\w+.-]|^)), it runs almost instantly:
/(?:[^\w+.-]|^)(?:([\w+.-]++):\/\/|(?:www\.))[^\s<]+/gx
It seems to me like the added clause shouldn't affect the regex at all, since nothing could cause that clause to fail (as those characters would be matched by the "[\w+.-]++" clause). Why does this make the regex run so much faster?
Edit
Some people have asked for an example of what I'm trying to do. To simplify things and to address the concerns people had in the comments, I'll be using the following two regexes:
# slow one
/(?:([\w+.-]+):\/\/|(?:www\.))[^\s<]+/g
# fast one
/[^\w+.-](?:([\w+.-]+):\/\/|(?:www\.))[^\s<]+/g
Fire up IRB/Pry and throw some text in a variable (this is a scrubbed version of what is actually searched against):
text = <<END_OF_TEXT
Unable to deliver message to email#example.com. Error message: request: <soap:Envelope xmlns:soap=";http://schemas.xmlsoap.org/soap/envelope/" xmlns:t=";http://schemas.microsoft.com/exchange/services/year/types" xmlns:m=";http://schemas.microsoft.com/exchange/services/year/messages"><soap:Header><t:RequestServerVersion Version="ExchangeYear"/></soap:Header><soap:Body><m:CreateItem MessageDisposition="SendAndSaveCopy"><m:SavedItemFolderId><t:DistinguishedFolderId Id="stuff"/></m:SavedItemFolderId><m:Items><t:Message><t:MimeContent>RGF0ZTogRnJpLCAwMyBBcHIgMjAxNSAxNDo0MzozMCArMDMwMA0KRnJvbTogPT91dGYtOD9RPz1EMD05ND1EMD1BMT1EMD1BMl89M4j5Ba0fQrvz8atXqDIHQS4xT5dBOrGbeSsUfHFTfj6eP8blEZKl16Pgp4iA0AcFvJtCeC6s3Iq5GJbVXivtrpyKa5n3yB0f6xUQdFc95hTUleo12k0MH3rRwi0RX8wxyRaWUH81yjiXRmcjeTWAhtOCoDVb7oxOmIZAXNVQAMh05JitBFSUwVvZQuXPOo7BfGsIog4rjacpj743JsmuHxuYSl7WZQj7hV9wxvVSpE78Ey6uLAAL3yCBQ41EOSG5alLJeUOb7LVTlPQL6cauwRDUERZ5UYlJzYXj26hfrpzIVL15RlzQyLwt0cFFLcOsKNBwVXoyRyB784mqhJ7Ks9pFngzk9GZQ23M9ivSD5tDvc2083K7DPgfNThy9ev64jKZ7Ktex2ljsBovyDK9zr9RLWTViuoRjpltzZ8efRu6cBMppofm5DxbQbowvP5nRXSdS9ay9gfZ6Z2Sl3mO5W4LQh6xOE2uCqLNCeQSVWqUzf7dHyLp1RE6br76Rok1rhE8xi7NomNWViQb3ZA45gbUsY0UqvhrsgVfGZ5z5XuyzezQ9u8sHxSOGoK4XgfoZbOboOgxtB07JNx0CtGupENtOCH3Et4lGoNmJ1Mb7DaUAVNEf3m90bHm1M2d8QX2Bik6fvk9TguIKWH6syPFKJMKQrR5zIouNEyqqERYjZIJtRl6AQ4DZQd2iKt1ENZm8XZJXQhtNoE9h3DhWLDsN26cipUd6Y0abkZQX9ObR4J8ULscFmZvkh0MDS0Grpx8zwxn0Mg9bDGqbbJ97iGCb0DkhiU1TsqEZSfZzRB9c8PRba8QrlZ4FapbE1tRswDe2MPSEjJCLUJIdnHDNmBfA807NMVfwYS8FF5fG63XIruKACsXjxiKq8KF4ciXpEuM2jJcp2fCqOD0t4OUXCnGvodu7for9xKA8JWU9tJYZEAnNUKrWK8yh18pXltvElyVRNfnYXkaqFWA7Py6AmdoCoKPL3zBJh9AkoK8QpZfRgQeG5XQ4WzQlG2Zhsz7lcA9v3uyv2g4kh8HAzzZ1c6fyKP4M32yS09tN2N91aMyWTvBEic60FHZbZsnUowoSgi1kBeSE4IV3BWp6wly2Z149sUshIuBxC3IdVGA2cHHpk0aQgdIPJkqniWATqiMpNhsTtvqZAxlhTY4ELxJWuOBgfEkh5nGG3MfRcGirPRqLLFGiOn0i9HnUzsKZ4hc9jNVKUQfrQgUEbH7Y6Zck7gCgxP72JgBulHZYUPwJMWSDdlUl2LYROVFbGphx6CwCCKG0yiD1ImNu1JLr4J9fZRFLHlaDorSNCCNf6ERCBUlOIVYe2NLxQrUwMFnE6WDuNJI8WTO2jhm5PugypaQHBuUw
cIvz5bhQD3PRcmJJieDTw4tbakH8NKl2KYJ63Yvsi2TP1AEuCm6EW5AVlxaDr2HqEjUdisK1TuOv2Y3vOpHjxwQGwnDAi61K2leqkNkCB4La7y2OHQVrVcEjALKF9cdR0tcfrTWXqoTPWevti8QPxkZDdBlImvMLaN1JH1XVv0XQoE5Qbvvv4RyxhGmFQV36VsdwE1s794QHwUl3dBxIZAYe5jqXc6VK90wRgvU8rDByY5WB3F0s2Bi7UeuLZnzHddTTBMiMU0yTXCQTBfx2OKM5PoIZdDbCCALMGbMwFRldlt8Z61vZEKZDjiJOkXCphAyNoY1K0vZJIGVDZjzNPRMy9fvQvOliREksdaKGYhJrpUsXGsDJLv5w7DcFygeOZPAkuacXCtLwYExbRd9SUqgOsCKRUbw7ZA4ysEVL1D8rpRbJeKKYKYIyOd9W2BEf5uGQ10YRRSzl8l87n28XApT5d8vYIznXp1zxs2jU0uQKd0NgvslVVUGRo3o17xGOnNBJM5QvyYnEG0FZnhXbIExq2zr0H1Km0OgEU1W7pFYyUE0spZwVtTazYD6TZPXUfj4jYS2vgnGRnHYfVAEH6Ufyl7VfcGbqsF8C5wTBLCSB854DBOI65NWEfJPmypwTUU5dkpk34DY9PEtb0w1LnLPAYlG35wlKPx63QExGVJgBu9rPRYX33NH8lPVeuHIXxdCZdvthOly2hVs6PsDU5WJxyIhIhvIauwY7nBx2BkGNfpoDX7Qb21MOUwWXGmwGBXIThZpP5oiUJTkVq1p3QsMl9cIlFzmvJvGMFHIca4ZZGZq4ukbFXOZJkJ5UWP0CYccv0HQPZjhpDJfnJNSctArpNmYGDAlFgkgrb24deZj0pNE1qfByF1H092bb9ZNIiBwp3d3av1S2thX6sKycCWibrF0MhqaEPGP016E018I4GUCk4IkuQ32ZMEAwzLVpjd9eT0pWMC6eeGy7EVobYs9OdZazxJQUOYc2JXGbdBJ1WK0BNrFANa3MzPUZBpw7tG98Bz3htPIAYRURwPqWFMAfwz0TvJZrLrVQUoQFBDl2mmdI1Fw0CoBnhC6w7XtjNKxfYWroixWIM6knPS1XcEDZoxSEUQGoUIzO2fK3RU2z0h4dbqUNfXCM7tLmIj6xqLlpbWZTy8XEAWq0bmABYy1VFv90yF0HeEF4laiKmYhQ5TyMJzS4XX77xIrLQNzDO03zWr1JMyucczShje0tZptKMGawoqWUyPFmbgzxTyjWjiLSWO6vxFl27SKbiERw74bqnjZ0hUWXsHHAnlCwl6nCfFvY52q6skalWZGBdiG7O4NbG8WUZxpcvtEu7xI9qYkECV8ThoicSudB0LV7T5VUKZvAwsOFcQ6L5kGJGxCS4dIwARllkvXmdKUyAYXGxk7IPoSJWcb4kjoxrlhoIvs8TlM4LfKBBYjOkfuz5XKCNEWnfDGDeWkjVagatxpbiDmDl8OhHppg6VB7LapvLIHolfqEmgAelTjhrTNBMHVypFk1945p9gDH6szpHkW5AADAy5o0rmGlUisiOG9zzEaFqMhCYIhW7Iqjd1SxBRrCPXxpe08kH6cYKObmHo1cerTBVml5BMUbmaxVG3X6VVKjFRpZD6mHOiFXvt0zwr69RntM9YvaUk31cIXOjF8rZ0VAxsnvYXD1QWtjhtY55L1DEzcSFhjiQChmBK4vUEaWksCbxYCeN8lVvCGaZ7wxHRNMdwrG7d0QHrw0C39LPZHGSjKYzbgx1JpM88dmQu2yNkrOuXJUYSJS6LBYGrIHCzJzebmmAMVjWIiCb6B5jccssUgn0Yl14PaP3LxG8QBsRlWjxfyz9b3aP4nvg4Db39hIB9cBRoqdooaa9FAv9HiVPEfiIA2N6q2DzaJFQglEcLVWXOVTuT3EAmc0Q2OqJE6WNlHFwVPXylalY4IDjEquQH4TZLjFwrUcNx3lTgSymASFmrUipau4QkL1PCrXY3AfqusoU1WRczs2
ZRZhBkTu6SUGyQuyohomIXvD6AjRx6eKQwzd4VdfauUads6v7CsryyU2tJim4IQge9wLjJd5HbD2BeRjbUH918fL4BreR7Kv2wMNXydJkS7URtW3CH3VImZqUMwJCsGtD158GcDrST7A81MdjezzrBFpH43eQIXfVnKYiv7zBHLvtaJ2NaoSGC7FhzfpCjsbZFMlZkiaQykCZp5pUfH1izWvC5fpJ9Cwkc4lEMcKGLPpFq6vugPYpktDLNaAvyDZfVK8XixTzpR3DAQ5ehWsd98lAmUElUu4SlB7EuB2QglcstnnXZRCqA2jfYpGoKsMPRIYOswHEqJbiBQdvr6i8LNcyhlCQEcqufQJDSampQQTFTYlqn4XI80n9h2U3eAS5xIqCELMJaBKJIjDFBH8bTXB1RzGUL3OnIZ39tkJH1Uf2ygSlJKHTlpCSgcLxCnSFoYKzNEybkHIW6teq3lGtE0Asdsbb7ys3ictlJ7omJET0iJrSoZf9S4KCJUgBbRn3FDHTp75XWofO7kw6WpYHl3LZOAOvVabmUu9jAdtxwbsx0y5SnmtFsp40cM3uWAsIM7j6Nx3144hgKQD72VuHpeFCCj2CrtERhVxkDyAV2EmroGmhom3rlkdeyjS44eee36KR5xB6q4nxl7qdXvlchQKoWb24Q5DnDzdmIGWh6EeDHse8MLlvtlKZhsRAH3TiTdqKHN1uLDCvS9PUz7i7PzbmFC3sWVxeG2FtRzQEw6BzOlQizXBClak69r5oJ5t7Hcx1Rlv9ktHX3A1rwM8S3eTmzGuZIxXSCZRddtrbdMxAyiYuH7lhJRGbC3x5AZgcqgYHSOJa9mmh5ouVgMoxj7Pt5qHSlIJ8E4YWnLGNmQyRVou06G6Dtu3EYmtUcZ1VgnMeCdwkBBPe0bkFAbCiTRfYTgC3oyftSxW4HF5WhN5yORlTfg4rtrWOC5rWsJZiuZSvV0l9IwdkPA04n1f0ryz0Eo7EtHf5EQEIHc66340SBjKQrChypJIw77QRCBvT8bJHRHjnZytfj1clmK20A1cAJaEncUrm0FusmiQSAkZgO86abE1tmgGh27p0O71EpQ1jo13MONtsltNrsec99X2qKWec3BqalQZzss93iJYgRzKbOgf4xkOCT9xM44p55JtFA3n4AXI12yJvplL6QixN045qeqC6ssMhJE2Y8H2EhHRIYBfNpLZ3xzEXH6aeywGF3optfPN5STtwlPXFdwlMwMXgp7Av5zuhaE7SBjgb5yL9FsX1XyffxeTtXQp00Er5JJvOkeP6H1N97GX0oOj1pfC2mexPVbbgDmoY6gyTx3fw2kYFqS0laELEtium2Jy5G1QIyisvaXAUjkLS3zo1wnJPzM48VApY2OqTsZlwWZAZgk8BKMh4U7XKij7cRWJsnKAR0yIFmMWwkUmao0X8AMc4Ki114cNx1HIeGuvpIZvedCHazFBpCzwjg6DZXHFYCB0yMGILd7R5D7LHcaVtlCq4ezPCULVpD2ZsB7bSnnTj8boYmzAWaoTJwUM9h1sEuiP4klUAj4YjHj0gdBYSgGInkMC9v64iago5lxNHKsF1rIy5tWvXmEbRbcbTPmYfuy3z101KCzAfGZkWNVbWVnV0gAbMf4405gU1HYDgJFXWZgaeHdRYyhcEXkDwEBHetGF97e9GcbvMgV35n9kmWUmtAEVAmk3mNEP1dPBiuQmKVQdaJem6aOMskQEMfC8rH9ILNshh9wYqLCtpvSFlrUODHdE1dkJJUdZVfF9DJZwOZx47VFfIak4NZSaP9iGT56aY6OEqKGC4zqxDxMVWHhdE1QWpGjcEoZj0Tfn1cCZ6iTl4UHqDWK3o0MDuS16OkU0tTn7qYuBdokaVAWcdTFtoWdLeFK87ZUjax99jFeBKNEdubnWHohy4kRovMIcRZ2tq55Ix5Dmakg1juNBBlyCJ1QccOjPWIXgzsnbcgTKYuh2EwkWDoaFD3WjZMPkMLQZxr5L7z1O
BDtcOfwmxp6MOGemt9lBGpDwK5LZlmpHB0jF8q1Iovdfbv3Nkg44AAjqV3oqHE41QBd34xUVlNVpNzDAyeU9reNmTE7y6A2GqUNPjerMPFp2Rj6ksqDs8JMuz4pSX4r72NlNGyOcE0dRLysM2AQwV1hI7xNPc6Jx60tqIaDeStJ9f8rLb0TjymgmsDKOFIT2SQK6R1nbORU2vhgyunLpnaw9oX41qYNYV8cttRNqtTwmlBf6Bt6jv9rJI2HBajBuBwaDpuSrsLEoWARJMImuo4djeKMm8tJHv3aj4k9qe68umdbSxSTBpffb1846CbNWnvUIki4VDEYP85c9CrRat6sk2oWH1w5NGMe5AYsR5schIVEIpf4OfHpaWHPT316ip1r3dgsKD95KZ25FzmPdtRZajIL25f5ZMzgpXZliwrThurFBDPRZVpiiZB1uEPWIktHU3e8u1B7Ug8qg3IQkEeXk73ir1Npe2ZpI4JYKkNkksrQjjchMJOWLkHzHoAQfuXhPfdVVUDfvVteBgXV2KB2XJKe73pSPqIeQhL8ozu7VjpBr9qoXW2UhtWzTCSDUjokLNta7KnEmqEZbcVyaQMfS0cb0GpM1B0JGXimkCiiQm22P4Hj4dsVJenXUnjqpLHBvFFgRspNqsMLaIqBdE1dLqcREtGPhy3dFta6OppJ8q4IhmsEqkLLu43LMzDH1p9e1xatH5iSiUNn8S1jhEIKEe1LGAO5VcYvds0WcJS0JchSDLvlIwQsb5xTAvocdWFICvIfCWpnmCkoCKl777LZ74G78STTjPubAaME3x9cliLzMIVXuGqZZ8zwpM2D71TIAgBy49IFvqrTMm7Rw50I5hZOwEMmCna4lzEKAdIMYiSrD0XHOA9XdGj6xj91zch0qR05HRLfJeIZzuiLR80B7QjxI84OgD7nLZrVuWxjVoOCvx8sR8vQcGClXUmMpB3RZfamBEswBsvaUKVPiHDTp0QpMKUtMeX9LzNLSrq7WgNugolPz2MpsGSNKYKREvhvTFAcxBc7ZjpflosHH4OvaQ0DUzgxp3mbZpY7eeHLeLT2DMODiYBD8QPaLusJXTvRovJviw8v0DG7A7s0qTfhidyiFVoWJvnZ4n3QSOh79XDlx5fAL1oZjQdEHEOpzGtuhXqCnJhvhhPN3vybKfyIvyJzIOu0NNSPe7P8jFOyLzOKiHMMSR0QG4vZhp3winzD6yCuq8tFo5p0jktwjvArc1OME09KdXyEgIY1JNANsHJiSTmnvRkXg0UyoZX48SdjAnDxlKLzRfT5128hIMRQXpi8RDI4SraijcX91If4NX7nm4K2AruWqbLnUTFiaXzcLBPitp9Ij5KyH3sxspAnykxHFTLfqPDv2Q6QBAyMvDw2TkDMB4dsJAmbiDelw8B01xbdm0j8f4Kemcqy8mjJlEX6cb9lSdqJnYeDisEXEsbqgVnS1ZejTomV0sjFYV3BGZ7oFwtiZa7MjnLhcYQAaaacw3lpSiRqM4yFvPrbV2ZAMl1vpd3YULaO36WZHRvUi8qe1Xwwmj4CHBDeX2moaIdlKxDlksKwvLi9C0hvOXdEBQiBWLA3AUO3pXGs9qIYY0BHolqWQCnXDMUcJBHgGaiT1CRXLydzNk0A3i8QXINindQsCuive0xjpb7YJYzpu7zlXYgmctprr6szyhLBIditlsuTAgu832tLUrnnKc1W4JHh5i9892V0FoD8ct9DOKUlB804IH9douZ97giyttRaIQXPr4DsQyYS25sDsC1h6bFsCqPIqXXbVHHims4hrrWz8f9kPlkBoEeIs5wFDCdmSnGE1hZI95NIH0JYnoBIsTRKLA3pwOlp6M1hvsXr3MIUONrmoZbHUdGyGhuUqeCDC6WM9bfqakuSVEfmf5nO9ayGrrPH5jfnVcbhMapBWAqp59gjAMbpPgYyD5pqD7apEEM66gEhwTLGWIUmrNL2SRTxmkA5BjBPiwmAjMeQFxdi1fU3CONJA
R5vqL7mmOCs3nNjNDrMJrUztVBfcydUz5QKW0S045Qs2f8oskGtIolrChroR4zLkvFkW0EJOTNMw5H3ntK6QRDgJDFyfBFOx7r80oYpzPBT3kUi1E7glttb986fOE5nlEeUoK1a23u3gAuCVLfj8eeovwPUgIzvWMvzfKWHPoNoJ41I0FJgr6M59sskx93wX0Olvcm2Jg9Vn5kWIUvQ7A1OYx9p1iLa4UHtiS71l4SYEyiLJWayxixuUGqrnR43eFezXLwf9N0b6LwsPf1b0xkgtFKFF8WR9V1k5VsoQDSxPkn0bXOksiQuvbRdLIMLGSefKY4sUUGYD3I01KicMj8R0yU3hpmZJyqX5meQzZjGNvkTMQfTPTY5jWkPFdRTAR8Y0WGYbw2LSjQU1yH0cE02IhOoRbjrdvq6gVuHx1my8BVHbSNSlp3IfzV6KAfqZQ9OgmXgnndfQ1IE0vdhQUsY5OkAhxlqdEUSL9tA5m8RP3qsgOwiQcu6CKWlS26lK2AetDh2r1njq13KOWL4rKikQaP0qxpS1z8oWSNwCURLEUzeESLuIiuv9drIn5HvjfrmgVFdhfi03MmuI2LLA69UbMsA8GhFki07Ssx4W0WBgYuhYSgh84SZMRyd9n3R5mjqdIW4fUORe6Ql3P0JFYjJTRKeakoqupL6DaYVYGmaxDY77bepszXnDJnbhPi18NiGt11vygNPvw8ppijWSaXxP3pb94IdjATK4f5886oyTSkhhYidWK6qQfXwsDuf8hr3Zhv5Vd5IC7hjnK82XYFkMu0slr2LyqsVnvPvdLFvnxdazJMtwBG1f3ddCDWOgGEtXDLps3dQ6HOxJGSrT9WIGSiuii3ypKAUc1uesbVte228SDOZcAfVMut395Ulw4X8VRw3szjRKKmlOvRgTrC5yYONhXiLC0VLv74wTkLG3QTEPuqKGXJszO0mTjbUmIcGlX4q7sLxNwrZnwDOJPJXh12cwCYJBsqToxoL9tstwxB3x99QzOQMuCAR8BdywJcO6wmc1n0fwVnK7tBalXMiTv8Fns0qURWzagMEOiN488KS0Qj54TRG6AJxWULfa9kfCbXxZycud5yPbmUcE6IQFnNTaKmj7m0U6t9mA5fPveDDtyUqrstxyr4WMtw4FiBwjpNpraKWnzBfi4OBK4iYaCJdPJKRrpaAQyGSJyGedQ9AgFHYh9EZopKZgH5pcBnf2oHfhuBTq0NJGmrKusb4nC1PGjV1jFGpF3r1RgYtWMQte6KUQNCDSW0XGegmU2VVrboOaeAaMWM23WXRxa92e6zMsYxzLgMUpAmXnonfIr9ZPAjzwx0UXLqnWZP99s6Da9DewN2EKXEXQgzllbLdr61pNasr0KeyOq1sRWLuQzXrvergG5q4GKUopJBH02sAFfTUcaxgbTvRUxwqTbvQksLz7KwsirrtzuaxTk5WpL8yg51mjPrvyLUhCGFv7IvQible3seUkmqea9eKwDfvc9ZJNzOzWTkIL5VygG7onS1dlkp7bYFeLl6n2Iy9XKAVDrzSH5zzCEPoihK2NftnXrPwf4YoEstg1zl93WCMJC3mingiyZ3ILTq7hEDvJWzDdtKP5OIlMFHlrexwg2ej5aoa6YO7oi9PgzjfSVlGmGbqv0IggxHhe0FP43CcaqJUfTzYekLHOmhk4to5Lf7ttISxAda3bQOSkxYR4gS7z9GCCgsArwzfgfmw42YyGwydDTMQuSrnLvJXLXvaO8Yn22gkQ3XZ22axJhQaqcAb4lw3oDTeQwSxFDxly8J4U4Vm71rKMWxAyfzDYkRMQMMehpw5bCPLuYUOBYTBIWdQtr4HXze3FNHRDdWAud7EorALu9Q4IwNrvf0Fy0nPivrrLEjE5wBJDH1usLMMTizZSCvpKscVN6NBk2ll8PmEQG01D9lCTOUIXpbbOatipjSTSHgR3lHt30rkmskRXxz6aVYzYaLmBDucf5vhv9IWORxRsP0KfkgqzfoZi
9dJ3RZoPaVJ2WoRCwGsFIx8cVPsF6L67kSTNuci9B0TbFUeuaCS1bauz4IUaWB4UZnZOY0hMtDYxc1zOSBM2h8x1QVeOxAbI74Sr6d768RWzx6sSShJ0RIS36zlLFmJ106ogFEqfA81dhSVJflBuFcSHMPwyvkD9YNiIWJCkP8kdoHASwdde7hdnom1OVKZSvjJxibdAJhKGwYdSy4YUsBPGXCtemsMp8Zn0xoQc5nhI2fPkrLPwFMftO4J2IO0sj7rQvtlYylTZFMHzaG5VDzrZrFMRPEnnjAkoKDFhfiXY9iMAK8busZaHjhANoQ5l0ewEkDCqxaU5ej8EffvmEruywI3luXyGkEUpef3qKev3w4Wf5umX5mYQncTAfqmxp0kn1iedSIfJFVu7Tm0IBAuir0T3XakyRgCEyVvytFNwiuvuVDw2HCmfgAnB7NtRvG8045XyirFBpLbgefOVkYh4IK1svdL86Dlh5Jo5mRhBTNf1nEmHcylgMJUWVB0RaVa8d4KwR2Y7cag7BSlC6IXl2Bxh1TZau4TUpSnikx7JtC42nkCSYFJbq1FIHMu2xrNhlrjgvNSjjipLOyv7uOzHUrgoCTcYYFik77ncAbCbFSxC4sjYAWR7rIQBC6Ghyw8eBdCdQyVwznjeeUFWpcWMiJC0C8ZywQllwj2eU0gsZjFHxPsMlU9v5yHJ5un2HlqDMHknrlXWGKctXt0xwcVQx0agFVJh7uRVNiYU5eATpDUPj8MBHekFD9m65mVTbNFOcpxKHliJqjglUmz0TCqCTlrRyjiNM7csOq6BWDt2jyhp0JTpZPTL161cmwgQWfcVPqv0lyNKWeSb8zVr86H9jxBTRu4CfOW7KorFEtkIVqDzynx3c33ruKPapMe83M4rNeXc8enDEMJ34z7urZDCmqXhgghwV8jzG959csv6J2kvfBLoTCH4lm0RRpWVkkx1oGVPTA9UiqDjv10qNXYCF0RUn6vsqGcz213KSfC4vG3X2pSnmk0Au8dEjA0mzhILy0z72zHLStcIuH6ShW3R1xQYr6bdGnsGn34aiz5ztJaVcwszOg31QXUJ4nRCMrVmkdkC5LuhpQcl3vKnzXxPEDHF6jscBfSYCEVNRmV1x1eXtNePKJMfSTvn4NUCitvqYrXVhrlfEIdANYxy5Z8wTaZh2fn3G4jKj4356Ar5bmpRjgcWyGvGjEILm0mknGSNMD2498vzP2wQO9rnM1tTbcVBckAgZCOzW3eYs0tJ25u7uhbLxxJLK3Z5laGpYL3QSUxiaPX1Che7fnMIL20jC9cJ3kUfzuILxSaFXdTuPxz2xox4JZ4yfdvaGDozG4DsTA0o0ZYSQQ8i2l7zNYQ136wputv5lfVrDGZ2bniRuAx99qKFrdTaYy4RuGdBDqS1g9OBwS6Wol0TgA291cWuxgINnACTQT5jcGQ0Z1ovcCkXjYR3Uk2vSz5a7pWxT27ZOSyGZ4s7g48aBaQCQIX9W5P235Aksw3AF0t2FTzYaVLwe5gYnZEtmdeh8CZOH8ZoDh3hcmEKbAH1kxHYqUzileGXWtGU8sR171Wf5Q6CXVnVL3gKJwUyLjESokumSA3vAJXysiuowShTVu8YBj5qqMzq1Jkuc6pg8xAOxfSrQOmDp6ul2uCtQWk5GcGQxzIxwH8RFhWj5p9zFbjHwkgZYDQoBpaZjXGbLYQUbWm3eCp5OTayu5FBljTodojLXszR789I87EadRyK72h4sSaXP5xvOY55puL3JowcjGdR2WdUT2fFIi0Rnr9uvXhzIuLgvpThUXpwHerVwJY5fTKE8ousbnMbUOCIbMjgJXwfey7ukYiTpm7TiRKGdeX3VukYQzKnOnt68IGljXFSJdg6xFMCfMG1pMXIKVc2BSoRgv0mQwljkgd1kNNUzzxHL1qiiH3ZtTD28uRvkUqcFFuKRogN7wtx4TRSePhK0kktAkVPdR1NhpJ3XZvHdpvvPsv16E86p6jIPwlZ
tPTmJCXUO0CjMIofAmRG2pmcXzxbKwGqn8yIvVy1QkXGc4f81Cd3sKlQIPYIO9HpkVqX8VdilJYyDQNNxqTiU6OOjigP7yF6ND2wXZScCdpsIS7eMNwoZRKJb3ccoioR15fCuHE8WuxuRR0hTC4D2cgPaFhRDKSBNm4EreCCQMC7Y8Bz7w8lLkE2fu3dmgcS3lC0a9XlnprVL9cpKP2mq8BWrGnWz05fyu41iCwGLseWcnqL8wD7zYwymlO1ptmWIOXVFbD6bgfGclFb5mtGsytRzBccdNE2gKIWTShnXPzsCYWsGAyxTPY1k5LPXZLEAayzmxJExoORuG82I5Aqy1yzcr7ew7mUeejyeJfrWPqL2zQCi6c6AMmaVN8BkLrviWw9DYp2slT5QCzNdcJDgC8pV6epUc7QxBTttuU8zfjwQ4ORYNnpA27xngdO8yIYxCand8ajx5kXpcpBPAELg4LJcPDLZ4HqGSUnEE4cUxrEuSnO1dXWOaTlxaiKjiScTM3SpYZrlOm7yhdce4xxvReEVFkHw6ykbfgo1TEk8wDgmIHBtPOoFOvQFCgrRmi0zAYyYRFiRGX3OyHKGs1qAoEWt4cOG6UJZpSGK8jz9BYag57lTE1yck9r29Y8UbsvLvI1NSLhJQLNk9gmLHRr0iGV7QRpYCkdcTz7eWO6VrmfFq42ngWid4gKSBqi0ts7dGYiv361JyHOKA3soVgjJ51dyJ64zdSpuJoa7HpKeUFfmhRA7uq6Ztsg1vmoVBjRdZe6SOLCtux2Cw4HbxDSEJBlVVLr99atQ2POJjzC3p6H5bpJK2HJBFWQJgtHF1WorwFNeL855c3LdIbfS5gU3EGxrfqowdYcC3UdaoLrRBFIjOFlzHXohh7Bo0IWHrFZpSt9QjD5eIvHXoLH4EZCrfNLLzHpBP7IIrKNGpVkDjAJ9soXmcmIJ5xt1hyriyho5X3N8hbuandC8cGRhGH4ba182PxBEbnvIVbZ9jX9hKjjRgABwr3GSPSUvQGbm0aj9myAnieCLALeVmNNGH92sO9FBY2whV3rcJehz9Q0onTQi2ABBXxQYVJMS7xF62g9gTIYHKZAHfu3MlNTqBaYz2N8zVwWsC1hgJnjHcACjMDENgZaCa373ZbPYLPTHqCecOYTGDTOH1cxYsxyjbsJUBvh2OYGZ7ZmBonymiDV6qiaoKDU1RED6gJTKStyGQ8xdZafxDMxPJx1djx4QzpmJEXODrwRR05UpwxCkWH4nG0RqIOE4keNNRx18Xcx9e8DDWnzNeNOW9fQ6klmcLbZRIXsy7Tg7nGdONu8TJkgojnHFbZ3mIBrkmg5HXRRTRIVLKHmNV3JsfEAqMo3eu4d5f8EL1IbAl5sdbbcf9zLbCjJoS0uRxuAXBsEPqlMYiN5krG4dvUtgNzjBoddnPXhAl3OTx04KA2K2W5wtrfcuYD4uthnWbr9fchyvE99UkfthE8dsAuOIX4yiJyrJ3JjLZndVrYhjfb5HrgEXkLgpzywOGzGqzCRwyN1Nmnkpj8zZwxwhtSZkiyafemzBx7pspb0Hr1l2eEjXucPnvccxtAYzTK7fFHDGE3VAe3HawVUikXXPArgK0YfRZWUo3uwJYRe9UspKHswg2pCQgPOJeVryJWgwRfooZdwTzmJO2noRdVLBFrRUQweyOzU4lgVBxVx72fvr4Dj35kd9mXqUq9fzvRBmNBrsTAiOhGc1Xq5z4C0NUvZEOSJa9AEkvCoiMyvj4Q77mJjr8Y0SXAbujXFyL4pQPfiVk3pTLHNSy0UEj3NHdtCmDmn2k5AimwL3VaXXack396CsdgMqTYfeRlNrqyaz8cRAjBKBH3vfrzlxvLs2A7Hk1lMVxCI71YccSW3R6W9uAuWUREXTJpOtCgN4xtLTjQx7CrO4AiIB8XsHROooTmWHaRQZ254NO17hbtgfvUZNlHnRzF6hreYDkUSaKb9AWWQkxYCNKeayjJgrXplfvXQfH6B7jXV
XdvhjgOajJsbRCmx5POMmqQKrecNPRcPE9TMOAPLEyHIEd2qp9z576s91Jkiw3xg1RqlkKoY4L5yv8TdKe3BuN4SFNxmbI7KeAM2O9AVInsH37Hlw5cBgFRCoxrOfs9bv7TrjcusBQsSu5JWJ8xAlN4cJt66SHol1j7QHHXHRHZCkdNqGpIG0uwC8jNKVQN32UZxKwNBXJhGkopXu1z4JoWh5fh19ojUVoDVnTGaLqv7cSfofOTBEII4FdfEy04LAzEOMT8Hp1ZkBpHjwJ8IdJ2fX3XZ9b6tcCK2Fv37vAxijBFYaMgUW1Ll4GEgcft599wtnaU1ICeWMJM2tvjQwWUMw0JvtdV06wvBTmlnN35IhWn8wUUDGtwt62ItvhPqSgNHkKR8zaPKvB5Dclnfpw10zaHrQhwdtsxucTqrtdgCe4d5UZNCbZZvMIdgQscG6I0o72ksErRemzSMwW4i2Zejzuwr1N5Ow6jR1BI8EIRRWoDJOQRbMb41ETfiObhOAUOeMafT7NmNHbDncm2VelfwV56pJa1FuAjD9Y7rYvGoE7iJ7wDKLSbgBQN94r1gnRjjixsXr7RCEVz7BZ1YNxpD5BdKvfT4KsJcfh9gYYsLzAc1Hk5ZpBCpZO1QBl2LLeS7HKmWEaZhxVxB1B3tTJz2YGdYtLTByvZeiv3lffi5FTFoUmB1Re9j29FcgWGITRdOcdVG9hbVyZUVQOYKJUtfwRugsoKGsuQx3dvr5VgpAyYXO6ybEqtjgT2YxtsasAXToO5XViTRy3oVkUZvjfSYBdoUbPT2UYr0P4L42tFw4pWi2FEwIU0cKUC7LJSl1TjJB85ZL41J8qUFlwl8MCw1gDMpMnVIdW38R3XqtqUNbRbOvgayTWGiBO1kWDi40Mr8jzRIKxDKJAur0YySwaVocY37ZRelZftCeziaRHX3D3LBY31DiVVwHxHwU0ZLFBohb5KYA0s0W6zdxp7pDUg8KvVRB7l6RqbJnILOH2mUTrSUHBTTLkGh45LtBRwh9pH78Ay4ztrEqetR33Cv0jMy9lVibFxM7PMPPdu9m0HOJVm33Uh0NSV7yVobcp525djoX6vFdmKIJDh8N447bd1DCO28TXxYgfg7sjcVGbzHJNgsM1VHTl5PzYlZzWxxNVGTKK3I70GKY6b7SsksarR1Zk3oo8Qtzr3fq2gxQpF9lDO62IxwfJqyeQplNH7yTeVvgFpnCSLx00P74GwGsr4rmc76lYXpiacpMEjgYHNcKry03zGIc6zVtbtyVsOGFZiwQezl4U2h18iwLRVswJlFhED2cqeUX7cGfv5G15iL3xYBtsSE4dCqzikh8eupJBuZLPuRY4kcSbzLFJufOusgSvb7WRfyN2IhWA8mIPiQseUywNbxi3QXXFdAmPwMC1i9JQ8Hx2tWzyeM0MleFXLs0kYCJgrKqLuSIwhA3oc4uXzbSz58oGIMTW4aE25zZXLUwXCcSRf2NMjI6xOkoycrZ8NGiV6PATiSQkGq7MxW5qc7SqEsrDSJRviAf2q1mVegXqKBRxOY3zR3SMalTGPaeM9P9HJQISqlPzOX3hOwNi6vR5cypj91hRoVkUBbZaG9yAdzkfQLrnt9NXHl2E8j0mPAH39tQSlBIirKMcmc29GSHoTAsfA3XxQz4i5X5l45Oxn9tdhFtFnPDwfAh6kFWMpXpKCyguqQ9N4w7W2en9305aKhYkzn5yWPtOt6LnDh7wHupLrDP9chweXsqqtxr46XuORLctrWlwIYVLnOG80RYOLghr4F2dyfAer9aMBaSPpybx1JFgOOICmPCRG8SiQQpX3sPNmT3Cj36vTi2OC6hM0RxgWOFqrMmwAMm4wBB0ukft22DLX7QbKVXkiNdoVDq6Q0BIF9U8gV4E40kvqqnwM56loEfBuYiFePURX1aw9iR3rRhaeY0CHwfV7BEyqKNUFDDzjXzCSV2dtxcv0MJd6mu1XhkICEt1z288tjPrVv2j1qfe6U
9vHrO9jGJIFCAuwJe7ouLXAycZDqZ9iWvJETOU4sYGWJFFzFJeHsA63xfX96Q0E4WJTkhKE8OGm0WrhglV5ZAnwbNQBDZ52fuw3Y3rQXiU87KCZTJoWs8GP3QQKVMSR6sFtAGyEJ3dOIk8tMMYxhcZPj8cDQ3BIA4qIBRywGTmpiOZ7sYS7zO5Df55IAeNDUvmETy5QQku6kQjfeseFqgrGmvdE17mD6zCcDmspf6IFZWEBC2pb55VhOIfX3Q5tfnfQBmisyirJZKy9QTTNwl09MomyBKw8XeBBgoc7pd2CRh17QT194OyAlwYNDM4KBOAqOAPkTmUhipiFzDemBPUDN3mVuVNVx7ZaJpOJF6aEvolnT9PiFMLRbOkFgiZAsQRIEyqdlTLt4fDpD0U9SNfYDmm092bwmLhKSvfQNmnXaHUz21ooTAkFcIVPatIhayhRO5MCFPjEf3Fwdp2brMVMI6C4ic4Win5qw9WiEBn5NqNAS4HSmX7GnFYvhs1OpJ3wZIEyT0CDj0woaBen5j4aAqXdpzFNWvxDL2UrhRSIFiJ5ELoER2nWqFEK8fOd5mFSXcqJlcejevLOqmubhuYdVMqtQ2WlRDjUNJVgAQ5JxE5c49B9R3aLy1hNyedOS1ApHn0RJ6iAxWMLFOsiajlhu4vCP1uNe8EMbAnu0pmaRAs0e5OpPpJkS91DTcZQ822fZ5tjkY7ATkxJcLDDt5pVKLTIs6HnAlrzVFjnUEvMrQSIo9xbDptTNOMqQ4shGSYe8QiAW4vLLoOvjJkXrghwWUnvsYcLKvwAQ0cTNE7g1xX6KOBDdBT8rvGDrAW1803702BohvPMX3ES5elx5B4NkL3OUoAP2b9jdzzodGhl7IdyEUPtA9QqwLVV1rGMFpNbxow5E5jVAgJWvOnjoe12JQCMCXoSWWAPZDsE4mxD3HxshEChzpgKulSsr5eKwmLMAKaYz5GCzIXdfTP9r628oezuYzxSvevAx1K6k5JqGgiKcZ7kFDXfNrr7UnpzgMN0kpgmdjgnKMJvViedVmlwJ2VX5CtZWCg7VthFpQYkxj2xca1IfTUrb8rk01x9q2BO7j0FliN4fY09cTfuCzl9E49HKFpkLKznUa4itV7KUIx1KMpxI0Ya7pX0ahwXG3imxmOYxBN0PtilhKeQ97N7QwdQdmSvapAp4auFBSyfSWsVqDg0JRRODf6VRsLkkM7iiB94X2lIYiLA1EnNqAloGAHOltdg1F2FMfKRLh518xhJBdBegSfzAQbBGZ7f88mYi4Ei55FQW8p5bdGbcubd8M7sdWLcjavWNgtEfr1wr2SwV5FFRWZVu313BQtPrZ2ODBepRy3Tu0HZPE1GCsNJvZMiKIWpLSa5PJRl5HAPfJVbwCLFNrRpFyT1Ks8G6knrdjURUPLPfYITjIPH49yz7NTViCmiWkDc7GFSn6PRd1hh22c8GmWzIiYIORcIGzZrkwYu3L2ECIFhmH4CWMlrTztubS5UgFYdF9jREkAkjfLp76QPEqf4TABPSHtK7uucZw5rkjuoFQNdhauUWvJcleChUAsGlliLcxDajTkzvYBUunt7GCHuRPYanIIbSZiOLl5GtbxnspE5QYf582X4yWFz6ohCNFhBwfXUOTqKWhFMH6Z7QN2YqWpFkktii51bj4eDXXK7bstq9zoPKB7FW7KZtPGEAAr7VV5ALuKmkFkW60rcG6xHsmxiv1ZQp3ejvz5NquIzSLoCiGK08BIBVN8yTqJLRZM3aX9U4EcNPLHXXtC3pOdgwUa99n4fwYx1ddOIADFhQDVLHQy31RWFPipqRUWSQR2AYDlNdjmdyTqJBe5CO91cswO60hk7FAHjGlu48zZgco4oDD7tLn9BSafwh9uxhNjhTXsWbeMO1t4DJ8dAZd6BGThjcfLCPFawfBfSvTNt73Cq6AkmRMSRdxnwPjAYbL3Cm4wPRJ2sKK5raRuC1MNY126QD7GzDugGoOYgF0cTckbeu
6VaofNMkqyDqBdOVpQPXHZX68wbqTKoKsv3x95vxWAXVN4TyjtsWySFPKIlbQhojHJ3c975jlCo3rhyCquVnm9FiYnCY80aRqG0ABcSAlqUHQljbHPRqzVaIKXWCheEps4cSdb4RNRHyyrf84380almWd9DmTqfCOtJE3KI8dcuptL7KEtJe87Ggfls04N1syrYEY0ZOg6l4EQBc2J3aM5jquilOc2yeGtoPdb2dyv8mScjtE4msoH9GNrjZOMHDX3i9okVVwqnyqJhADSg5EzK5EpEfG7yxKgcYjrs9pOAnNs5OzhWlf6YcZ4VhflXm4amuY5Tf7AbhuxOjxiFglR1oxPLRMOBqEOlYwTe1Hb7u4f3pm4fX5w0GczNdojb7IffhYwZ2zju8gQEidySAoxgWQAiZmSxeJ21WCE4YcDulyL7NeqSeHN3tdmNfMQ9nwkjm8geu40n3YZBeNJZkuLlNvrxx0Ah8lVBuAC6BfWPkcKWsUQYX8rShiWIbHJJrYzO0D27TGWPZE1rbL8z4zuqebzAh5ZDy2JtQ217FvlFOTol0kJVjB6QE9Sm7P4jEKG4RNSdKO8Ixb6dAETYOKSoC3iFvjHyykZjovvDCACkUneTLgVhrAL8Cminu4Vsbsmz8SMHtVv3MTE6TlWJMksOT591LeJqLzdwc7gWZYRczzbY5btLbVyCUKkCShiiz8NU5xlLNtl0bXfxPrxyskz4ZzMZEFNM5ryjJUasT3pSnuYjcwdEK5ri6FaIArc24MWcmWJivb8NyLyabrksePJORoLbHRRnMbMmYM49JZPcUa2Hf7Jc4zIn0Mf4emMXWpKJPuR9qkhIIMaA8cWM9T3mgprc8Unb4LGqhDXptyr1dmEYKPZRSZv0ILKZSeSWZVYTZ1HWoSlVZFg5EiKj9QFBtthcCZdrz3sc8ugcdRboHXGEH5SSbfZVM9cy9JOtL2nO2ZDmMUx3dSZm8zXIBL5Upom8Z1PHkGIvXa8qgh4bJDzSTIyrLL9TCHSCiltWW2MzDkcZu0eBKyKrpJ9jiqnvLgBN6tsuDkhtBKk1IYi9MYEYE6YjiRSQhZubtUVYKopCpRM4igc8StqsjSRvZShI4ei2cYCW0yKQmhKq6QeiJFECrUHXrZK0fE3eAuXNunXrPcXK8daorknhmyOIXO2xa1CtEwCwAd4cfDx8sJ3RZIdSwnKTLqMQN38ovShWkiIGpFMpTi4LUYl32DMKwjodXGHslYJaSxv02qTo2kheqhcfqlQrEQQJ5tvdmLKGaoCKYiwAoVHgDS6QD6q21YTGzRzAYcdNR7RwVwY0LBH6Jd7hHRriexAG8uvEgLRdKEvVoAsowVVsgfaKNSQBQgV6bAHgVZYobCPbsY3OCbVxSyjcbWCtJ4u5dtpFrRE05hulVBaXXw24tsoT1139nWeiDllBxqFCOlWphpb77PhzUPegsI7YDPKiXFqQB7PsAvhHVNm9X04FN4PPHTRchrRmc8BQ1xfP36mVCMTN1W5zJovdFyvd5EG4kymc456KMefh7oOCZZP0JVDc4WiNmK1pdN492zBOTY1Jbsi5HQXSs5lpYHlH56AdvaAbeLSt1izQ2Lbk5V6UgiBL5AwgZmHRkdhAcZYW0elh5CjdKT1cg1ngJMbSHKl756Pyioc73ZQvdXOuwHK7ClMQ6VuNdJnoy0sExffYE1AKO4ORHJnbXwjqZnmNJuYP6FR3vzCv4OXgzg6sh24oy9ofUYX6VMdi90UabMOcA9YIzfTKCtA9dxgvVHnUigz97dkthRKk71br0wJTsv01Sk9KjD6CdNSoWaB1tm3T6ADV3dA9DKj2oX1YsFovUgcDtmkhjmC7hdKi6Up4PeIYlfvg8B3cZtncNy9bhWHHtOw1PtKLTcgPpJyW2a8SjsyNxnLLtJosf6gJLrPyYAA55q87bDTYS6iUHriTnrjlRTPGNqWRMsPWLRxKboaHKnCC8RBPAFOgiJhNROJ2rGDWrKSWn
aedxVczwoCswxbBmcCwr6YysIgDih5ewdgZvCWaR4gakVHyyMcVsFhOaR6UZRV4cXcREanMAbMPNnyrZXCybbVua5kgXMKbYehmM8qZZi8rhkbcjquAFQnEUjWYhL4vZk2PR8wk0TIMkRx0NWSAR8PeJkRx03JAj8aSsgMWtxCfX1yMR02IXYT5kEMKws1Cq8bUG1NMYfkzgcgKJeHpvZzWPmhhqSKUOZnqvng5RgkUjygToobecX80JbjD7rwezbeZaLy4Irl82fX1FUyrSu5xEwVTRpnkGG6BMLKqDdUhuTu7ZcmztjDuxP0w4MaE2yrbbkctbNsaHTEtIzjFbKdocRZvQjJlW2bn3uYKiGXoVo806B49oRmZToPgsUJ8i3wBXRDnfscilqCMfjsOcXo8i2PjLvXJbUZSlJbWeDAVzyBUrYQamdcZlKM1Aqnte2hwqus2J9fq4PbIxhXgDt2z6pEqhPHiiAdzoesKxW38g22DFSrnYwryekslCqlQhVkdAkoDULSlR5gz4SQyVobvWalGpAo2MEwmLaT70qItgCHv0TVGvmQRyTzK201LGrd8slwwmHl9yYiQgu2nrNC3WvetveHdAP7UlhnY1CtNG9tQFHzmefW6ohg9v0
END_OF_TEXT
Use the slow regex on it and note how slow it is:
text.gsub(/(?:([\w+.-]+):\/\/|(?:www\.))[^\s<]+/).to_a
Use the fast regex and note how fast it is:
text.gsub(/[^\w+.-](?:([\w+.-]+):\/\/|(?:www\.))[^\s<]+/).to_a
I figured out that this problem is specific to the type of data I used in the example (not a lot of spaces). If you run it against RFC 3986, which is much longer, both versions are equally fast.
The first pattern is slow because it starts with an alternation whose first branch is very permissive: it allows any number of word characters, plus signs, dots, or hyphens. As a consequence, at each starting position the engine can consume a long run of characters before finding that no :// follows, so the alternation takes a lot of time/steps before failing.
The second pattern is faster because (?:[^\w+.-]|^) (which is an alternation too) works like a kind of anchor. Even though it is an alternation, it is tested quickly, because the first branch matches only one character and the second is a zero-width assertion. So it takes less time/steps to fail (in particular because it must be followed by a word character, a plus sign, a dot, or a hyphen, which is a binding condition).
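A rough way to see this is to time both patterns on a string with no spaces and no URL. This is only a sketch: the input is an artificial worst case, not the text from the question, and the timings depend on the Ruby version (newer versions may optimize the slow case considerably).

```ruby
require 'benchmark'

# Patterns from the question, without the /g flag (Ruby has no such flag).
slow = /(?:([\w+.-]+):\/\/|(?:www\.))[^\s<]+/
fast = /(?:[^\w+.-]|^)(?:([\w+.-]+):\/\/|(?:www\.))[^\s<]+/

# Hypothetical worst case: one long run of word characters, no URL anywhere.
text = 'a' * 5_000

slow_result = nil
fast_result = nil
slow_time = Benchmark.realtime { slow_result = slow.match(text) }
fast_time = Benchmark.realtime { fast_result = fast.match(text) }

puts format('slow: %.4fs, fast: %.4fs', slow_time, fast_time)
```

Neither pattern finds a match here; the difference is in how much work each one does before giving up.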
But you can write this pattern in a better way. Since you are looking for URLs, you can be more precise about the beginning: the URL can begin with, let's say, "http", "ftp", "sftp", "gopher", "www" (feel free to add other schemes if needed).
So you can describe the start with:
(?:https?:\/\/|ftp:\/\/|sftp:\/\/|gopher:\/\/|www\.)
To limit the cost of the alternation (5 branches to test at each position in the string) you can use two tricks:
you can use a word boundary to quickly skip positions that are not the start or the end of a word:
\b(?:https?:\/\/|ftp:\/\/|sftp:\/\/|gopher:\/\/|www\.)
you can add a lookahead with the first letter of each branch, to quickly skip unneeded positions in the string without testing the five branches:
\b(?=[fghsw])(?:https?:\/\/|ftp:\/\/|sftp:\/\/|gopher:\/\/|www\.)
So you can write a more efficient pattern like this:
/\b(?=[fghsw])(?:https?:\/\/|ftp:\/\/|sftp:\/\/|gopher:\/\/|www\.)[^\s<]+/
In short: a pattern is efficient when it fails fast at bad positions in the string.
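For illustration, here is that pattern used with String#scan on a short made-up sample (the URLs and surrounding text are invented, not taken from the question):

```ruby
url_re = /\b(?=[fghsw])(?:https?:\/\/|ftp:\/\/|sftp:\/\/|gopher:\/\/|www\.)[^\s<]+/

# Hypothetical sample text.
text = "See http://example.com/a and www.example.org<br> or sftp://host/file now; " \
       "foo.bar is not a URL."

# The pattern has no capture groups, so scan returns the full matches.
urls = text.scan(url_re)

p urls
```

Note that "foo.bar" passes the lookahead (it starts with "f") but is rejected as soon as none of the five branches match.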
Another possible design uses more memory and needs to check whether the capture group exists for each match, but is faster:
/[^ghsfw]*+(?:\B[ghsfw][^ghsfw]*)*+|\b((?:https?:\/\/|ftp:\/\/|sftp:\/\/|gopher:\/\/|www\.)[^\s<"&]+)/
(the idea is to divide the pattern in two main branches, the first one describes all that you want to avoid, and the second describes the urls. The effect is quick jumps to key positions in the string)
Note: when patterns begin to be long, you can use the free-spacing mode (or comment mode...) for readability and maintainability:
/(?x)
\b (?=[fghsw])
(?:
https?:\/\/ |
ftp:\/\/ |
sftp:\/\/ |
gopher:\/\/ |
www\.
)
[^\s<]+/
or you can use a formatted string and a join as suggested by Cary Swoveland in comments.
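One possible reading of the join suggestion is sketched below; the variable names are made up, and this is not necessarily what the commenter had in mind. The prefixes are kept in an array and joined with | when the pattern is built:

```ruby
# Regex source fragments for each accepted prefix (single quotes keep
# the backslash in 'www\.').
prefixes = ['https?://', 'ftp://', 'sftp://', 'gopher://', 'www\.']

url_re = /\b(?=[fghsw])(?:#{prefixes.join('|')})[^\s<]+/

# Hypothetical sample text.
found = "visit www.example.org or ftp://example.com/file today".scan(url_re)

p found
```

If a new scheme is added to the array, the lookahead [fghsw] has to be updated by hand to include its first letter.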

Ruby gsub same format for all 3 but 1 of them outputs differently [closed]

Closed 10 years ago.
I'm trying to add HTML tags around text that matches certain criteria, via regex.
Right now, I have * and ** working (adding h1 and h2 tags).
I have the same format for the li tag, but it adds the closing tag right next to the opening tag, which is not what I want.
Why is the output different?
:::TEXT FILE::::
* this is an awesome *
** this could be something better **
* another test *
- li are good for lists
::: END TEXT FILE :::
:::MY OUTPUT:::
-bash-4.1$ ruby markup.rb testMarkup.txt
<h1>this is an awesome <\h1>
<h2> this could be something better <\h2>
<h1>another test <\h1>
<li><\li>li are good for lists
:: END OUTPUT:::
:: Ruby File ::
#!/usr/bin/env ruby
text = IO.read(ARGV[0])
text = text.gsub(/^\*{1}[^*](.*?)\*{1}\s/) do
  "<h1>" + $1 + "<\\h1>"
end
text = text.gsub(/^\*{2}(.*?)(\*{2})\s/) do
  "<h2>" + $1 + "<\\h2>"
end
text = text.gsub(/^[-](.*?)\s/) do
  "<li>" + $1 + "<\\li>"
end
puts text
:: END RUBY FILE ::
In your last regexp, /^[-](.*?)\s/ you should add $ in the end of the regexp to match only the whole line starting with -.
/^[-](.*?)\s/ matches only the string "- " (the hyphen and the whitespace after it, with an empty capture) and replaces it, leaving the rest of the line unmodified.
FYI, you can simplify your solution, turning it into a more generic one. Declare a hash with pairs of type Regexp => tag, then iterate over it. You will like the code! Feel free to ask for further help.
P.S. I guess you are trying to convert Markdown into HTML. Although it's a very good exercise to write it yourself, you can use a ready-made solution, like Maruku.
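The hash-based approach suggested above could be sketched roughly as follows. The RULES table, the markup method and the exact patterns are assumptions for illustration, not the asker's code. The sketch also emits the standard closing-tag form </h1> rather than the <\h1> shown in the question, and trims the spaces around the captured text.

```ruby
# Hypothetical rule table: pattern => tag name.
# Order matters: the "**" rule has to run before the "*" rule.
RULES = {
  /^\*\*[ \t]*(.*?)[ \t]*\*\*[ \t]*$/ => "h2",
  /^\*[ \t]*(.*?)[ \t]*\*[ \t]*$/     => "h1",
  /^-[ \t]*(.*?)[ \t]*$/              => "li"
}

# Apply every rule in turn to the whole text.
def markup(text)
  RULES.reduce(text) do |result, (pattern, tag)|
    result.gsub(pattern) { "<#{tag}>#{$1}</#{tag}>" }
  end
end

input = "* this is an awesome *\n" \
        "** this could be something better **\n" \
        "* another test *\n" \
        "- li are good for lists\n"
puts markup(input)
```

Every pattern ends in $, so the lazy group has to stretch to the end of the line; that is what the li rule in the question was missing.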
The crucial difference is that, in the regex for the tags h1 and h2:
/^\*{1}[^*](.*?)\*{1}\s/
/^\*{2}(.*?)(\*{2})\s/
the string to be captured, matching (.*?), has to be followed by a specific terminator, \*{1} or \*{2}, which forces the non-greedy match to keep extending until it reaches that terminator, whereas in the regex for the tag li:
/^[-](.*?)\s/
the string to be captured only needs to be followed by a more general character \s, which is already met right after -, so the non-greedy match does not extend beyond that.
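The difference can be checked directly with String#match. This is a small sketch using lines from the question's text file; the third pattern is one possible fix, not the only one.

```ruby
h1_line = "* this is an awesome *\n"
li_line = "- li are good for lists\n"

# h1: the lazy group must run up to the closing "*".
m1 = h1_line.match(/^\*{1}[^*](.*?)\*{1}\s/)
p m1[1]   # => "this is an awesome "

# li: \s is already satisfied by the space right after "-",
# so the lazy group captures nothing.
m2 = li_line.match(/^[-](.*?)\s/)
p m2[0]   # => "- "
p m2[1]   # => ""

# Anchoring to the end of the line forces the lazy group to extend.
m3 = li_line.match(/^[-](.*?)\s*$/)
p m3[1]   # => " li are good for lists"
```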

Variable Declaration Regex

I'm trying to make a simple Ruby regex to detect a JavaScript variable declaration, but it fails.
Regex:
lines.each do |line|
unminifiedvar = /var [0-9a-zA-Z] = [0-9];/.match(line)
next if unminifiedvar == nil #no variable declarations on the line
#...
end
Testing Line:
var testvariable10 = 9;
A variable name can have more than one character, so you need a + after the character-set [...]. (Also, JS variable names can contain other characters besides alphanumerics.) A numeric literal can have more than one character, so you want a + on the RHS too.
More importantly, though, there are lots of other bits of flexibility that you'll find more painful to process with a regular expression. For instance, consider var x = 1+2+3; or var myString = "foo bar baz";. A variable declaration may span several lines. It need not end with a semicolon. It may have comments in the middle of it. And so on. Regular expressions are not really the right tool for this job.
Of course, it may happen that you're parsing code from a particular source with a very special structure and can guarantee that every declaration has the particular form you're looking for. In that case, go ahead, but if there's any danger that the nature of the code you're processing might change, then you're going to be facing a painful problem that regular expressions really aren't designed to solve.
[EDITED about a day after writing, to fix a mistake kindly pointed out by "the Tin Man".]
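To make the limits concrete, here is a small sketch that runs the corrected pattern (with + added on both sides) against the kinds of declarations mentioned above. The constant name and the last sample line are made up for illustration.

```ruby
SIMPLE_DECL = /var [0-9a-zA-Z]+ = [0-9]+;/

samples = [
  "var testvariable10 = 9;",        # the question's test line
  "var x = 1+2+3;",                 # expression on the right-hand side
  'var myString = "foo bar baz";',  # string literal
  "var total = 42"                  # no trailing semicolon (hypothetical sample)
]

results = samples.map { |line| !(SIMPLE_DECL =~ line).nil? }

samples.zip(results).each do |line, matched|
  puts "#{line.inspect} => #{matched ? 'match' : 'no match'}"
end
```

Only the first line matches; each of the others is a valid declaration that the pattern misses.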
You forgot the +, which allows more than one character in the variable name.
var [0-9a-zA-Z]+ = [0-9];
You may also want to add a + after the [0-9]. That way it can match multiple digits.
var [0-9a-zA-Z]+ = [0-9]+;
http://rubular.com/r/kPlNcGRaHA
Try /var [0-9a-zA-Z]+ = \d+;/
Without the +, [0-9a-zA-Z] will only match a single alphanumeric character. With +, it can match 1 or more alphanumeric characters.
By the way, to make it more robust, you may want to make it match any number of spaces between the tokens, not just exactly one space each. You may also want to make the semicolon at the end optional (because JavaScript syntax doesn't require a semicolon). You might also want to make it always match against the whole line, not just a part of the line. That would be:
/\Avar\s+[0-9a-zA-Z]+\s*=\s*\d+;?\Z/
(There is a way to write [0-9a-zA-Z] more concisely, but it has slipped my memory; if someone else knows, feel free to edit this answer.)
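On the shorthand: in Ruby, \w is equivalent to [a-zA-Z0-9_], so it is close to [0-9a-zA-Z] but also accepts an underscore, and [[:alnum:]] is the POSIX bracket form for letters and digits. The sketch below goes one step further and is only an assumption about what a stricter pattern might look like: DECL is a hypothetical name, and the identifier rule (a letter, _ or $ first, then word characters or $) covers only ASCII identifiers.

```ruby
# Hypothetical stricter pattern; captures the name and the numeric value.
DECL = /\A\s*var\s+([A-Za-z_$][\w$]*)\s*=\s*(\d+)\s*;?\s*\z/

m = DECL.match("var testvariable10 = 9;\n")
p m[1]   # => "testvariable10"
p m[2]   # => "9"

p DECL.match("var  _count=10")     # matches: extra spaces, no semicolon
p DECL.match("var 9lives = 1;")    # => nil (a name cannot start with a digit)

# \w accepts the underscore, the explicit class does not.
p "a_1" =~ /\A\w+\z/               # => 0
p "a_1" =~ /\A[0-9a-zA-Z]+\z/      # => nil
```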

How to conflate consecutive gsubs in ruby

I have the following
address.gsub(/^\d*/, "").gsub(/\d*-?\d*$/, "").gsub(/\# ?\d*/,"")
Can this be done in one gsub? I would like to pass a list of patterns rather than just one pattern - they are all being replaced by the same thing.
You could combine them with an alternation operator (|):
address = '6 66-666 #99 11-23'
address.gsub(/^\d*|\d*-?\d*$|\# ?\d*/, "")
# " 66-666  "
address = 'pancakes 6 66-666 # pancakes #99 11-23'
address.gsub(/^\d*|\d*-?\d*$|\# ?\d*/,"")
# "pancakes 6 66-666 pancakes  "
You might want to add a little more whitespace cleanup. And you might want to switch to one of:
/\A\d*|\d*-?\d*\z|\# ?\d*/
/\A\d*|\d*-?\d*\Z|\# ?\d*/
depending on what your data really looks like and how you need to handle newlines.
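Since the question asks about passing a list of patterns, one option is Regexp.union, which builds the same alternation from an array. The sketch below also shows how ^ and \A differ once the string contains a newline; the two-line sample string is made up for illustration.

```ruby
patterns = [/^\d*/, /\d*-?\d*$/, /\# ?\d*/]
combined = Regexp.union(patterns)   # alternation of the three patterns, in order

address = '6 66-666 #99 11-23'
by_union = address.gsub(combined, "")
by_hand  = address.gsub(/^\d*|\d*-?\d*$|\# ?\d*/, "")
p by_union == by_hand   # => true

# ^ matches at the start of every line, \A only at the start of the string.
two_lines = "12 Main St\n34 Oak Ave"
p two_lines.gsub(/^\d*/, "")    # => " Main St\n Oak Ave"
p two_lines.gsub(/\A\d*/, "")   # => " Main St\n34 Oak Ave"
```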
Combining the regexes is a good idea--and relatively simple--but I'd like to recommend some additional changes. To wit:
address.gsub(/^\d+|\d+(?:-\d+)?$|\# *\d+/, "")
Of your original regexes, ^\d* and \d*-?\d*$ will always match, because they don't have to consume any characters. So you're guaranteed to perform two replacements on every line, even if that's just replacing empty strings with empty strings. Of my regexes, ^\d+ doesn't bother to match unless there's at least one digit at the beginning of the line, and \d+(?:-\d+)?$ matches what looks like an integer-or-range expression at the end of the line.
Your third regex, \# ?\d*, will match any # character, and if the # is followed by a space and some digits, it'll take those as well. Judging by your other regexes and my experience with other questions, I suspect you meant to match a # only if it's followed by one or more digits, with optional spaces intervening. That's what my third regex does.
If any of my guesses are wrong, please describe what you were trying to do, and I'll do my best to come up with the right regex. But I really don't think those first two regexes, at least, are what you want.
EDIT (in answer to the comment): When working with regexes, you should always be aware of the distinction between a regex that matches nothing and a regex that doesn't match. You say you're applying the regexes to street addresses. If an address doesn't happen to start with a house number, ^\d* will match nothing--that is, it will report a successful match, said match consisting of the empty string preceding the first character in the address.
That doesn't matter to you; you're just replacing it with another empty string anyway. But why bother doing the replacement at all? If you change the regex to ^\d+, it will report a failed match and no replacement will be performed. The result is the same either way, but the "matches nothing" scenario (^\d*) results in a lot of extra work that the "doesn't match" scenario avoids. In a high-throughput situation, that could be a life-saver.
The other two regexes bring additional complications: \d*-?\d*$ could match a hyphen at the end of the string (e.g. "123-", or even "-"); and \# ?\d* could match a hash symbol anywhere in the string, not just as part of an apartment/office number. You know your data, so you probably know neither of those problems will ever arise; I'm just making sure you're aware of them. My regex \d+(?:-\d+)?$ deals with the trailing-hyphen issue, and \# *\d+ at least makes sure there are digits after the hash symbol.
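The "matches nothing" versus "doesn't match" distinction can be observed by counting how often gsub actually performs a replacement. This is a small sketch; count_replacements is a hypothetical helper and the sample addresses are made up.

```ruby
# Counts how many times gsub finds a match (empty matches included).
def count_replacements(str, pattern)
  count = 0
  str.gsub(pattern) { count += 1; "" }
  count
end

address = "Main Street"
p count_replacements(address, /^\d*/)   # => 1 (an empty match at the start)
p count_replacements(address, /^\d+/)   # => 0 (no match at all)

# The trailing-hyphen case: the original pattern eats "123-",
# the stricter one leaves the string alone.
p "Main Street 123-".gsub(/\d*-?\d*$/, "")       # => "Main Street "
p "Main Street 123-".gsub(/\d+(?:-\d+)?$/, "")   # => "Main Street 123-"
```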
I think that if you combine them in a single gsub() regex, as an alternation, it changes the context of the starting search position.
For example, each of these substitutions starts at the beginning of the result of the previous substitution:
s/^\d*//g
s/\d*-?\d*$//g
s/\# ?\d*//g
and this
s/^\d*|\d*-?\d*$|\# ?\d*//g
resumes search/replace where the last match left off and could potentially produce a different overall output, especially since a lot of the subexpressions search for similar if not the same characters, distinguished only by line anchors.
I think your regexes are distinct enough in this case, and of course changing the order can change the result.
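A small sketch of one input where the two approaches do differ, using the patterns from the question. The input "#12" is a made-up edge case: in the combined pattern, ^\d* wins with an empty match at position 0, gsub then steps past the # without trying the other alternatives there, and the # survives.

```ruby
address = "#12"

# Three passes: each gsub rescans the previous result from the start.
sequential = address.gsub(/^\d*/, "").gsub(/\d*-?\d*$/, "").gsub(/\# ?\d*/, "")

# One pass: the search resumes after each match instead of restarting.
combined = address.gsub(/^\d*|\d*-?\d*$|\# ?\d*/, "")

p sequential   # => ""
p combined     # => "#"
```

With ^\d+ in place of ^\d* there is no empty match at the start, so the combined pattern removes "#12" as a whole.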

Resources