Regex: extract CSS styles from CSS file - ruby

I have a big CSS file containing all the CSS we need for our internal framework, but I only need a few of the styles, so I want to extract just those. I used a regular expression to extract them:
cssFileContent.scan(/\.#{cssName}.*?\{.+?\}/im)
In Ruby, scan extracts every match of the pattern from the string; cssName is the CSS style name.
i - case insensitive
m - dot matches everything, so \n is matched too
It gives me some style blocks, but skips one every time. For example, I have .abc-style { } and .def-style { }, but the result looks like:
.abc-style {
}
}
so def-style is skipped.
Can someone explain why this happens, and how to correct it?

Instead of using a regex I would use a CSS parser to do this.
There are plenty on CPAN to choose from, e.g. CSS, CSS::SAC, CSS::Tiny and CSS::Croco. Choose the one that best fits your needs.
Here is an example using CSS::Tiny...
use strict;
use warnings;
use CSS::Tiny;
my $css = CSS::Tiny->read('your_stylesheet.css');
my $new = CSS::Tiny->new;
# styles I want to extract
$new->{$_} = $css->{$_} for qw/.abc-style .def-style/;
$new->write('extracted_styles.css');

Try excluding the closing brace from the block body and making that part greedy, like this:
cssFileContent.scan(/\.#{cssName}.*?\{[^\}]+\}/im)
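A minimal sketch of that idea, assuming a small made-up stylesheet; extract_rules and the sample class names are hypothetical. It also passes the name through Regexp.escape and keeps the selector part from running past an opening brace, which goes slightly beyond the pattern above:

```ruby
# Hypothetical sample stylesheet; the class names are for illustration only.
css_file_content = <<~CSS
  .abc-style {
    color: red;
  }
  .def-style {
    color: blue;
  }
  .other {
    margin: 0;
  }
CSS

# Hypothetical helper: returns every rule block whose selector starts
# with the given class name.
def extract_rules(css, css_name)
  # Regexp.escape guards against names containing regex metacharacters.
  # [^{]* keeps the selector part from crossing an opening brace, and
  # [^}]* stops the body at the first closing brace.
  css.scan(/\.#{Regexp.escape(css_name)}[^{]*\{[^}]*\}/im)
end

abc = extract_rules(css_file_content, "abc-style")
def_rules = extract_rules(css_file_content, "def-style")

puts abc
puts def_rules
```

Because neither part of the pattern can cross a brace, each match ends at its own closing brace. This still treats CSS as flat text, so nested blocks such as @media rules would need a real parser.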

Related

Is it possible to sort with gsub (anonymous function) like ruby in Scala?

I'm new to Scala. I need to know whether it is possible to do something like this in Scala:
input2.lines.sort_by { |l| l.gsub(/.*?\+(.*?)\+(.*)\n/,"\\2\n").to_i }
Please help.
It looks like you're trying to sort strings by a sub-section within each string. To do that you first need a regex with a capture group to select the region you're interested in.
val re = ".*\\+.*\\+(\\d+)".r
Now you can extract and modify what was captured and use the result as the sorting rule.
lines.sortBy{case re(n) => n.toInt}

How to inject two variables next to each other separated with a space?

The expected HTML result is as follows:
<li>description1 name1</li>
<li>description2 name2</li>
<!-- ... -->
Where the list of description-name is known and can be iterated over.
I tried to do:
li
  = tool.description
  |
  = tool.name
or
li
  = "#{tool.description} #{tool.name}"
but it seems like an ugly way to achieve that.
Is there any other and elegant solution?
You can use interpolation directly in both Slim and Haml, so you don’t need to use = and quote the whole string.
In Slim, you could do:
li #{tool.description} #{tool.name}
and in Haml the only difference is that you need to add the leading %:
%li #{tool.description} #{tool.name}

How to prevent CKEditor replacing spaces with &nbsp;?

I'm facing an issue with CKEditor 4. I need output without any HTML entities, so I added config.entities = false; to my config, but some &nbsp; still appear when
an inline tag is inserted: the space before it is replaced with &nbsp;
text is pasted: every space is replaced with &nbsp;, even with config.forcePasteAsPlainText = true;
You can check that on any demo by typing
test test
eg.
Do you know how I can prevent this behaviour?
Thanks!
Based on Reinmar's accepted answer and the Entities plugin, I created a small plugin with an HTML filter which removes redundant &nbsp; entities. The regular expression could be improved to suit other situations, so please edit this answer.
/*
 * Remove &nbsp; entities which were inserted, e.g. when removing a space and
 * immediately inputting a space.
 *
 * NB: We could also set config.basicEntities to false, but this is strongly
 * advised against since this also does not turn e.g. < into &lt;.
 * @link http://stackoverflow.com/a/16468264/328272
 *
 * Based on StackOverflow answer.
 * @link http://stackoverflow.com/a/14549010/328272
 */
CKEDITOR.plugins.add('removeRedundantNBSP', {
  afterInit: function(editor) {
    var config = editor.config,
      dataProcessor = editor.dataProcessor,
      htmlFilter = dataProcessor && dataProcessor.htmlFilter;

    if (htmlFilter) {
      htmlFilter.addRules({
        // Replace an &nbsp; that directly follows a word character with a plain space.
        text: function(text) {
          return text.replace(/(\w)&nbsp;/g, '$1 ');
        }
      }, {
        applyToAll: true,
        excludeNestedEditable: true
      });
    }
  }
});
These entities:
// Base HTML entities.
var htmlbase = 'nbsp,gt,lt,amp';
Are an exception. To get rid of them you can set basicEntities: false. But as the docs mention, this is an insecure setting. So if you only want to remove &nbsp;, then I would just use a regexp on the output data (e.g. by adding a listener for #getData) or, if you want to be more precise, add your own rule to htmlFilter just like the entities plugin does here.
Remove all &nbsp; but not <tag>&nbsp;</tag> with Javascript Regexp
This is especially helpful with CKEditor as it creates lines like <p>&nbsp;</p>, which you might want to keep.
Background: I first tried to write a one-liner in Javascript using lookaround assertions. It seems you can't chain them, at least not yet. My first approach was unsuccessful:
return text.replace(/(?<!\>)&nbsp;(?!<\/)/gi, " ")
// Removes &nbsp; but not <p>&nbsp;</p>
// It works, but does not remove the trailing entity in `<p> blah&nbsp;</p>`.
Here is my updated working one-liner code:
return text.replace(/(?<!\>\s.)(&nbsp;(?!<\/)|(?<!\>)&nbsp;<\/p>)/gi, " ")
This works as intended. You can test it here.
However, this is a risky practice, as lookbehind assertions are not supported by all browsers.
Read more about Assertions.
What I ended up using in my production code:
I ended up with a slightly hacky approach using multiple replace() calls. This should work in all browsers.
text // receiver assumed: the string passed to the filter
  .trim() // Remove surrounding whitespace
  .replace(/\u00a0/g, " ") // Replace Unicode non-breaking spaces with plain spaces
  .replace(/((<\w+>)\s*(&nbsp;)\s*(<\/\w+>))/gi, "$2<!--BOOM-->$4") // Protect tags that contain only &nbsp; with a BOOM placeholder
  .replace(/&nbsp;/gi, " ") // Replace all remaining &nbsp; with plain spaces
  .replace(/((<\w+>)\s*(<!--BOOM-->)\s*(<\/\w+>))/gi, "$2&nbsp;$4") // Turn the BOOM placeholders back into &nbsp;
If you have a better suggestion, I would be happy to hear 😊.
I needed to change the regular expression Imeus posted. In my case, I use TYPO3 and needed to edit the backend editor, and the original expression didn't work there. Maybe this helps someone else with the same problem :)
return text.replace(/&nbsp;/g, ' ');

Selenium Webdriver + Ruby regex: Can I use regex with find_element?

I am trying to click an element whose id changes with each order, like so:
edit_div_123
edit_div_124
edit_div_xxx
xxx = any three numbers
I have tried using regex like so:
@driver.find_element(:css, "#edit_order_#{\d*} > div.submit > button[name=\"commit\"]").click
@driver.find_element(:xpath, "//*[(@id = "edit_order_#{\d*}")]//button").click
Is this possible? Any other ways of doing this?
You cannot use Regexp, like the other answers have indicated.
Instead, you can use a nifty CSS Selector trick:
@driver.find_element(:css, "[id^=\"edit_order_\"] > div.submit > button[name=\"commit\"]").click
Using:
^= finds the element whose value begins with your criteria.
*= finds the element whose value contains your criteria anywhere.
$= finds the element whose value ends with your criteria.
~= finds the element by a single criterion when the actual value is a space-separated list of values.
Take a look at http://net.tutsplus.com/tutorials/html-css-techniques/the-30-css-selectors-you-must-memorize/ for some more info on other neat CSS tricks you should add to your utility belt!
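As a rough illustration of what each operator tests, here is a plain-Ruby sketch. ATTRIBUTE_MATCHERS and attribute_matches? are hypothetical names, not part of Selenium, and the String methods only approximate the CSS rules (for example, CSS never matches on an empty criteria string):

```ruby
# Plain-Ruby analogues of the CSS attribute operators (illustration only;
# this is not how a browser or Selenium evaluates selectors).
ATTRIBUTE_MATCHERS = {
  "^=" => ->(value, criteria) { value.start_with?(criteria) },     # begins with
  "*=" => ->(value, criteria) { value.include?(criteria) },        # contains anywhere
  "$=" => ->(value, criteria) { value.end_with?(criteria) },       # ends with
  "~=" => ->(value, criteria) { value.split.include?(criteria) }   # whole word in a space-separated list
}.freeze

# Hypothetical helper: applies one operator to an attribute value.
def attribute_matches?(operator, value, criteria)
  ATTRIBUTE_MATCHERS.fetch(operator).call(value, criteria)
end

puts attribute_matches?("^=", "edit_order_123", "edit_order_")
puts attribute_matches?("~=", "btn primary large", "primary")
```

So [id^="edit_order_"] matches edit_order_123, edit_order_124 and so on without any regex.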
You have not provided any HTML fragment that you are working with, so my answer is based only on the limited input in your question.
I don't think the WebDriver APIs support regex for locating elements. However, you can achieve what you want using plain XPath as follows:
//*[starts-with(@id, 'edit_div_')]//button
Explanation: The XPath above selects all <button> nodes under any element whose id attribute starts with the string edit_div_.
In short, you can use the starts-with() XPath function to match elements whose id has the format edit_div_ followed by any number of characters.
No, you can not.
But you should do something like this:
function hasClass(element, className) {
  var re = new RegExp('(?:^|\\s+)' + className + '(?:\\s+|$)');
  return re.test(element.className);
}
This worked for me
@driver.find_element(:xpath, "//a[contains(@href, 'person')]").click

Is it possible to exclude some of the string used to match from Ruby regexp data?

I have a bunch of strings that look, for example, like this:
<option value="Spain">Spain</option>
And I want to extract the name of the country from inside.
The easiest way I could think of to do this in Ruby was to use a regular expression of this form:
country = line.match(/>(.+)</)
However, this returns >Spain<. So I did this:
line.match(/>(.+)</).to_s.gsub!(/<|>/,"")
This works well enough, but I'd be surprised if there isn't a more elegant way to do it. It seems odd to use a regular expression to declare how to find the thing you want when you don't want the enclosing strings that were used for the match to be part of the returned data.
Is there a conventional approach to this problem?
The right way to deal with that string is to use an HTML parser, for example:
country = Nokogiri::HTML('<option value="Spain">Spain</option>').at('option').text
And if you have several such strings, paste them together and use search:
html = '<option value="Spain">Spain</option><option value="Canada">Canada</option>'
countries = Nokogiri::HTML(html).search('option').map(&:text)
# ["Spain", "Canada"]
But if you must use a regex, then:
country = '<option value="Spain">Spain</option>'.match('>([^<]+)<')[1]
Keep in mind that match actually returns a MatchData object and MatchData#to_s:
Returns the entire matched string.
But you can access the captured groups using MatchData#[]. And if you don't like counting, you could use a named capture group as well:
country = '<option value="Spain">Spain</option>'.match('>(?<name>[^<]+)<')['name']
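To make the difference between the whole match and its capture groups concrete, here is a small sketch using the sample line from the question. The String#[] form with a capture index is an alternative not mentioned above, and the variable names are only for illustration:

```ruby
line = '<option value="Spain">Spain</option>'

match = line.match(/>([^<]+)</)
whole = match.to_s   # the entire matched text, delimiters included
first = match[1]     # only the first capture group

# String#[] accepts a regex plus a capture index and returns that group directly.
short = line[/>([^<]+)</, 1]

# A named group avoids counting parentheses.
named = line.match(/>(?<name>[^<]+)</)[:name]

# With several options in one string, scan returns the captures of every match.
html = '<option value="Spain">Spain</option><option value="Canada">Canada</option>'
countries = html.scan(/>([^<]+)</).flatten

puts whole, first, short, named
puts countries.inspect
```

The delimiters still take part in the match; they are simply left outside the group that gets read back.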
