I work on a WYSIWYG plugin for DokuWiki that uses CKEditor. It has been in use since the FCKeditor days, and no one has ever raised any security concerns before. But a user recently asked whether the Scayt spell checker is a security risk because of how it is implemented, i.e. it passes text from the wiki as parameters to the Scayt servers in order to check spelling. On a public wiki this would not matter. But when a wiki is closed, internal to a company or on a personal LAN, does this potentially open up the closed wiki to a third party? I would appreciate any information or views.
For the complete exchange of views on this topic see: https://github.com/turnermm/ckgedit/issues/434
My client needs data scraped from a website. I am planning to use php_curl. The problem is that the site uses Google reCAPTCHA. A few important data items are visible only when you click a "show this information" link; the reCAPTCHA then appears in a lightbox, vanishes, and the information is displayed.
I have checked the source HTML: the protected item is only loaded when someone clicks, and there is no way for me to automate that click. I have even tried opening the site in an iframe and using JS to click it, but that fails because the two domains are different. I have also tried the Selenium standalone version, but its downloads are corrupt.
Unless there is a design flaw in the website, the reCAPTCHA will prevent you from scraping the material without human intervention.
Technically, your best bet is to employ humans to solve CAPTCHAs all day and to write software that automatically scrapes the protected material for each CAPTCHA they solve. A number of viable businesses have been built this way, where the data is valuable and there is a genuine public interest in opening up the data set. (For example, I have heard that flight companies use CAPTCHAs to prevent price-comparison sites from driving down the cost to the consumer, and I would argue that in such a case there is an overwhelming public interest in defeating those defences.)
Morally, however, you would need to tell us what you are doing for us to advise you. It is possible your client is merely planning to take other people's material and then attempt to monetise it, even though they had no hand in creating it. That may breach copyright law, but more importantly, they (and you) need to decide whether the scraping is fair.
I faced the same problem and resolved it by clearing the cookies on the HTTP request's user agent, waiting for a while (Thread.Sleep), and then starting to scrape again. I am doing this in C#, not in PHP, but applying the same logic may help you.
Locally I have Apache running, serving http://home/s.html.
There, using JavaScript, I fetch the XML file from a Flickr search. I want to represent the search results visually by inserting HTML into the s.html page.
The images are all blocked.
A random example of a blocked file: https://farm9.staticflickr.com/8120/29612550501_6162ed8901_n.jpg
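For illustration, here is a minimal sketch of the kind of page described above. The feed URL, the tag parameter, the element id and the Atom parsing are my assumptions (the actual code may differ), and error/CORS handling is omitted:

```html
<!-- s.html: sketch only; fetch a Flickr search feed and insert the images. -->
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>Flickr search</title>
</head>
<body>
  <div id="results"></div>
  <script>
    // Fetch the XML (Atom) feed for a Flickr search and insert one <img>
    // per entry. The images are served from *.staticflickr.com, which is
    // what Firefox tracking protection blocks on this page.
    fetch('https://api.flickr.com/services/feeds/photos_public.gne?tags=sunset')
      .then(function (response) { return response.text(); })
      .then(function (xmlText) {
        var doc = new DOMParser().parseFromString(xmlText, 'application/xml');
        var results = document.getElementById('results');
        doc.querySelectorAll('link[rel="enclosure"]').forEach(function (link) {
          var img = document.createElement('img');
          img.src = link.getAttribute('href'); // points at staticflickr.com
          results.appendChild(img);
        });
      });
  </script>
</body>
</html>
```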
This is addressed in other questions here on Stack Overflow, but not really answered with respect to the following detail:
Is it possible to somehow circumvent the tracking protection using JavaScript?
P.S. Firefox seems to use a protection list maintained by disconnect.me.
This list really runs wild! It even blocks the small avatar images (under https://secure.gravatar.com) that are used on the Bugzilla website itself! (Random example page: https://bugzilla.mozilla.org/show_bug.cgi?id=1101005) On a more philosophical note, I wonder whether the Firefox developers browse with strict tracking protection themselves, and therefore see their own Bugzilla website slightly broken.
I will post on this topic on Super User, with a question appropriate for that site. Here I only want to know whether it is possible for me to circumvent that blocking with my JavaScript!
Some results on Google Search come with an AMP (Accelerated Mobile Pages) icon on their links, at least when using a mobile device; as soon as you click the link, instead of loading the site, Google shows you a cached version of it.
I want to disable this behaviour in my results, and I see at least two good reasons for it:
When sharing the link, it is a pain in the neck to have the huge Google URL in place of the shorter one you would have with the original.
Security: when you access a site and see a URL other than the one you wanted to load, you should distrust it, even if it looks like Google (remember, you can get phished or even caught in a trap hosted on Google Sites). Google should respect that instead of encouraging users to trust a URL just because it looks like Google's. This is even worse when combined with the first reason and you want to share the URL with a friend.
I have to remove the Google AMP prefix over and over. Is there an advanced search option or cookie that makes Google give the clean URL?
According to the AMP project FAQ you cannot:
By using the AMP format, content producers are making the content in AMP files available to be cached by third parties.
As a content producer, I dislike Google adding their own URL and branding around my content. From the consumer's perspective, it looks as though the content comes from Google. They say it is to improve speed, but you can see Google's intention behind this "free" technology.
A simple hack is to keep following the AMP guidelines for the speed they give the page, but violate one rule (for example, add your own JavaScript that does nothing).
Once a page has a validation error, Google will not cache it.
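As a rough sketch (with most of the required AMP boilerplate trimmed for brevity), the single deliberate violation could look like this; the do-nothing script is just an example, any validation error would do:

```html
<!doctype html>
<html amp lang="en">
<head>
  <meta charset="utf-8">
  <script async src="https://cdn.ampproject.org/v0.js"></script>
  <link rel="canonical" href="https://example.com/article.html">
  <meta name="viewport" content="width=device-width,minimum-scale=1,initial-scale=1">
  <!-- Required amp-boilerplate style omitted here for brevity. -->

  <!-- Custom <script> tags are disallowed by the AMP validator, so this one
       line makes the page invalid and keeps it out of the Google AMP cache,
       while the rest of the page still follows the AMP guidelines. -->
  <script>/* intentionally does nothing */</script>
</head>
<body>
  <h1>Example article</h1>
</body>
</html>
```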
By publishing AMP pages, you let Google or any other AMP cache store and deliver your web page (which, surprisingly, seems to be legal):
Caching is a core part of the AMP ecosystem. Publishing a valid AMP document automatically opts it into cache delivery. (https://www.ampproject.org/docs/fundamentals/how_cached)
To stop AMP caching, the project recommends invalidating the format by removing the amp attribute from the <html> tag. I propose something else.
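(For reference, that documented route is just this change to the root element, roughly:)

```html
<!-- Valid AMP, automatically opted into cache delivery: -->
<html amp lang="en">

<!-- Attribute removed: no longer valid AMP, so AMP caches will not serve it: -->
<html lang="en">
```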
One thing I have always disliked about AMP is that it requires you to embed the JavaScript code directly from their server (https://cdn.ampproject.org/v0.js), effectively telling AMP about every single visitor to every AMP page. Embedding the code from your own server stops this privacy issue, disables caching, and still gives you the framework.
To do so you can build your own AMP framework using the source code:
https://github.com/ampproject/amphtml
But it's much simpler to just copy v0.js and all the scripts it fetches to your own server.
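In the markup, that amounts to swapping the CDN URL for your own copy; the local path below is only an example:

```html
<!-- Standard AMP: the runtime is loaded from Google's CDN on every page view. -->
<script async src="https://cdn.ampproject.org/v0.js"></script>

<!-- The self-hosted alternative suggested above: serve your own copy of v0.js
     (plus the scripts it fetches), so visitors never contact cdn.ampproject.org. -->
<script async src="/amp/v0.js"></script>
```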
Odd, because Google says to remove the "amp" attribute from the tag to prevent caching; it says nothing about loading the JS locally:
https://amp.dev/documentation/guides-and-tutorials/learn/amp-caches-and-cors/how_amp_pages_are_cached/
Is Google wrong?
I would like some advice and a little help.
A recent project I have been working on had its version of SagePay upgraded to use the version 3 protocol. A side effect of this is that users are no longer able to edit their billing address details. The previous protocol, v2.23, did allow this in both low-profile and normal mode.
Furthermore, from what I have understood of the new protocol, iframes are no longer supported.
I have read the documentation, which suggests downloading the custom templates (Card_Details.xslt and Card_Details_Low.xslt), customising them and sending them over to SagePay so they can be added to the system. But I am unable to find any instructions on how to achieve this.
How do I give the customer the ability to change the billing address?
Any help will be appreciated.
Thanks
Change the template you are using (in My Sage Pay) to Default. That should set things back to how they were.
FYI - iframe integration is still supported in v3.00
Are there any services out there that can parse a website and give some feedback on how search-engine friendly it is? And perhaps even suggest changes to the markup to improve indexing?
Think of the W3C's validation services.
Try Google Webmaster Tools. After you add your site, it will often list "problems" with your site, such as duplicate title tags and meta descriptions, and also things like 404 errors.
If you are a GoDaddy customer you can validate web-crawler friendliness on your hosted sites.
Tools that automate the alteration of markup for any objective are a horribly frightening proposition. Simply write your code correctly the first time. If you have archaic code, it likely has many other problems in addition to SEO, and automatically imposing global changes can expose problems you may not be prepared to address.