Dynamic web scraping with Goutte and Laravel - laravel

Is there a way to do web scraping without defining the HTML tags?
I have my code:
$crawler = $client->request('GET', 'https://www.cnnchile.com/opinion/');
$result = $crawler->filter('.inner-item__content > h2')->each(function ($node) {
return $node->text();
});
As you can see, I always have to define the HTML tag that holds the content. Is there a way to get the data in general, without defining the tag?
Thanks

You want just the page content? Then use body as the filter tag.

Related

EditorJS: best way to sanitize HTML in the client

I'm using editorJS with Strapi (back) and NextJS (front).
Currently, I sanitize the data directly in the frontend with a custom parser.
It works fine, but I have an issue when I paste content from Word: it adds a span with a style attribute.
This is the part where I transform the data from EditorJS into a paragraph block:
paragraph: function (e) {
return "<p>" + e.data.text + "</p>";
},
//e.data.text = <span style="font-size:11pt;font-family:Arial;">Example</span>
I want to get rid of all the styling. What's the best method?
Should I sanitize the HTML in Strapi, or keep doing it in the front-end app?

How to display PDF Documents on the browser using a View in Laravel 5.8

I'm working on a web application using Laravel 5.8, and I'm new to the Laravel framework. I would like to display PDF documents in the browser when users click certain buttons. Authenticated users will be allowed to "View" and "Download" the PDF documents.
I have created a Controller and a Route to allow displaying of the documents. I'm however stuck because I have a lot of documents and I don't know how to use a Laravel VIEW to display and download each document individually.
/* PDFController*/
public function view($id)
{
$file = storage_path('app/pdfs/') . $id . '.pdf';
if (file_exists($file)) {
$headers = [
'Content-Type' => 'application/pdf'
];
return response()->download($file, 'Test File', $headers, 'inline');
} else {
abort(404, 'File not found!');
}
}
}
/* The Route */
Route::get('/preview-pdf/{id}', 'PDFController@view');
Mateus' answer does a good job of describing how to set up your controller function to return the PDF file. I would do something like this in your /routes/web.php file:
Route::get('/show-pdf/{id}', function($id) {
$file = YourFileModel::find($id);
return response()->file(storage_path($file->path));
})->name('show-pdf');
The other part of your question is how to embed the PDF in your *.blade.php view template. For this, I recommend using PDFObject. This is a dead simple PDF viewer JavaScript package that makes embedding PDFs easy.
If you are using npm, you can run npm install pdfobject -S to install this package. Otherwise, you can serve it from a CDN, or host the script yourself. After including the script, you set it up like this:
HTML:
<div id="pdf-viewer"></div>
JS:
<script>
PDFObject.embed("{{ route('show-pdf', ['id' => 1]) }}", "#pdf-viewer");
</script>
And that's it — super simple! And, in my opinion, it provides a nicer UX for your users than navigating to a page that shows the PDF all by itself. I hope you find this helpful!
UPDATE:
After reading your comments on the other answer, I thought you might find this example particularly useful for what you are trying to do.
According to the Laravel docs:
The file method may be used to display a file, such as an image or PDF, directly in the user's browser instead of initiating a download.
All you need to do is pass the file path to the method:
return response()->file($pathToFile);
If you need custom headers:
return response()->file($pathToFile, $headers);
Route::get('/show-pdf/{id}', function($id) {
$file = YourFileModel::find($id);
return response()->file(storage_path($file->path));
})->name('show-pdf');
Or, if the file is in the public folder:
Route::get('/show-pdf', function () {
return response()->file(public_path('pathtofile.pdf'));
})->name('show-pdf');
Then show it in the page using:
<embed src="{{ route('show-pdf') }}" type="application/pdf">

How to avoid [Disqus] Discussion duplication when using Pagination?

I am having a problem with Disqus and Laravel pagination. I have a Post model where each post has miniposts, and I want each page of a post to show 5 miniposts. When I navigate to the 2nd pagination page, a new Disqus discussion is generated automatically for that page. How can I avoid this duplication?
<div id="disqus_thread"></div>
<script>
var disqus_config = function () {
this.page.url = route('home');
this.page.identifier = $slug; // Replace PAGE_IDENTIFIER with your page's unique identifier variable
};
(function() { // DON'T EDIT BELOW THIS LINE
var d = document, s = d.createElement('script');
s.src = '//testblog.disqus.com/embed.js';
s.setAttribute('data-timestamp', +new Date());
(d.head || d.body).appendChild(s);
})();
</script>
<noscript>Please enable JavaScript to view the comments powered by Disqus.</noscript>
The query
$posts = Post::where('slug', $slug)->firstOrFail()->minipost()->orderBy('created_at', 'desc')->paginate(5);
You can't achieve this directly, because Disqus serves the comment thread according to your URL.
During pagination your URL will probably change, and you can't stop that.
At the same time, you want the same Disqus thread to appear on every page.
So you need a workaround.
Way 1:
Put the pagination and all the content in an iframe, so that the URL won't change.
Way 2:
Use jQuery DataTables or a similar plugin, and customize it to get the look you want.
Way 3:
Look into the Disqus code and hardcode the URL for that particular page, like
this.page.url = "http://someurl.com/#!" + id;
Hope this helps.
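A variation on Way 3, as a rough sketch: instead of hardcoding the URL, strip the pagination parameter before handing the URL to Disqus, so every page of the same post resolves to one thread. The helper name canonicalThreadUrl is hypothetical, and the sketch assumes the paginator uses Laravel's default ?page=N query parameter.

```javascript
// Hypothetical helper: returns the URL without the pagination parameter,
// so page 1, page 2, ... of the same post all map to one Disqus thread.
// Assumes the paginator uses the "page" query parameter (Laravel's default).
function canonicalThreadUrl(href, pageParam = 'page') {
  const url = new URL(href);
  url.searchParams.delete(pageParam); // drop ?page=N, keep other parameters
  url.hash = '';                      // ignore any fragment as well
  return url.toString();
}

// Possible use inside disqus_config (browser only):
//   this.page.url = canonicalThreadUrl(window.location.href);
//   this.page.identifier = slug; // same identifier on every page of the post
```

Keeping both page.url and page.identifier constant across pages is what stops Disqus from treating each page as a new discussion.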

JavaScript code in view issue in Laravel

I put JavaScript code in a view file named product/js.blade.php, and include it in another view like this:
{{ HTML::script('product.js') }}
I did this because I want to use Laravel functions in my JavaScript, for example:
var $path = '{{ URL::action("CartController@postAjax") }}';
Everything actually works, but the browser throws a warning message. I want to ask how to fix it, if possible.
Resource interpreted as Script but transferred with MIME type text/html
Firstly, putting your Javascript code in a Blade view is risky. Javascript might contain strings by accident that are also Blade syntax and you definitely don't want that to be interpreted.
Secondly, this is also the reason for the browser warning message you get:
Laravel thinks your Javascript is a normal webpage, because you've put it into a Blade view, and therefore it's sent with this header...
Content-Type: text/html
If you name your file product.js and instead of putting it in your view folder you drop it into your javascript asset folder, it will have the correct header:
Content-Type: application/javascript
... and the warning message will be gone.
EDIT:
If you want to pass values to Javascript from Laravel, use this approach:
Insert this into your view:
<script type="text/javascript">
var myPath = '{{ URL::action("CartController@postAjax") }}';
</script>
And then use the variable in your external script.
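As a minimal sketch of that last step, the external script can read the global set by the inline block, so the static file itself contains no Blade syntax. The helper name getCartPath and its scope parameter are hypothetical; only myPath comes from the snippet above.

```javascript
// product.js: served as a static asset, so it contains no Blade syntax.
// getCartPath is a hypothetical helper; `scope` is normally `window`.
function getCartPath(scope) {
  const path = scope.myPath;
  if (typeof path !== 'string' || path.length === 0) {
    // Fails loudly when the inline <script> was not rendered before this file.
    throw new Error('myPath is not defined; include the inline script before product.js');
  }
  return path;
}

// In the browser: const path = getCartPath(window);
```

The check only guards against loading the scripts in the wrong order.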
Just make sure that CartController@postAjax returns a JavaScript content type and you should be good to go. Something like this:
#CartController.php
protected function postAjax() {
// ....
$contents = '/* a whole bunch of JavaScript code */';
$response = Response::make($contents, 200);
$response->header('Content-Type', 'application/javascript');
// ....
return $response;
}
I'm not sure if this is what you're asking for, but here is a way to map ajax requests to laravel controller methods pretty easily, without having to mix up your scripts, which is usually not the best way to do things.
I use these kinds of calls to load views via AJAX into a dashboard app. The code looks something like this.
AJAX REQUEST (using jQuery, but anything you use to send AJAX requests will work)
$.ajax({
//send post ajax request to laravel
type:'post',
//no need for a full URL. Also note that /ajax/ can be /anything/.
url: '/ajax/get-contact-form',
//let's send some data over too.
data: ajaxdata,
//our laravel view is going to come in as html
dataType:'html'
}).done(function(data){
//clear out any html where the form is going to appear, then append the new view.
$('.dashboard-right').empty().append(data);
});
LARAVEL ROUTES.PHP
Route::post('/ajax/get-contact-form', 'YourController@method_you_want');
CONTROLLER
public function method_you_want(){
if (Request::ajax())
{
$data = Input::get('ajaxdata');
return View::make('forms.contact')->with('data', $data);
}
}
I hope this helps you... This controller method just calls a view, but you can use the same method to access any controller function you might need.
This method returns no errors, and is generally much less risky than putting JS in your views, which are really meant more for page layouts and not any heavy scripting / calculation.
public function getWebServices() {
$content = View::make("_javascript.webService", $data);
return (new Response($content, 200))->header('Content-Type', "text/javascript");
}
Return the above from a method of your controller,
and write your JavaScript code in the webService view inside the _javascript folder.
Instead of loading the data via AJAX, I create a JS Blade view with that specific data and base64_encode it; then, in my JS code, I decode and use it.
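The decoding side of that approach might look like the sketch below. It assumes the server rendered something like base64_encode(json_encode($data)) into the page; the JSON step is an assumption, since the answer only mentions base64_encode. The helper name decodeEmbeddedData is hypothetical.

```javascript
// Hypothetical helper: decodes a base64 string that wraps UTF-8 encoded JSON.
// Assumes the server produced it with base64_encode(json_encode($data)).
function decodeEmbeddedData(encoded) {
  const binary = atob(encoded); // base64 -> binary string (one char per byte)
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  const json = new TextDecoder('utf-8').decode(bytes); // bytes -> UTF-8 text
  return JSON.parse(json);
}

// In a Blade-rendered page the encoded string would come from the server, e.g.
//   var payload = decodeEmbeddedData(encodedFromServer);
```

Going through TextDecoder rather than using the atob result directly keeps non-ASCII characters intact.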

Display a cross-domain RSS feed in a WordPress site

I need to display cross-domain RSS feeds (XML format) on my site, but I get an error because cross-domain AJAX calls are not allowed. I've been told about JSON-P. Does anyone know how to use it, or have a good tutorial?
Thanks
The simplest way is to create a widget for WordPress, or download an existing one that fits your requirement.
JSON-P loads data in JSON format, so if you can get the data as JSON, these links will help you:
getJSON
ajax
Or you can access the RSS feed with PHP, as in this example:
$xml = 'http://blog.webtech11.com/feed';
$doc = new DOMDocument();
$doc->load($xml);
$item = $doc->getElementsByTagName('item');
//$data = array();
for($i=0; $i<=3; $i++){
$title = $item->item($i)->getElementsByTagName('title')->item(0)->childNodes->item(0)->nodeValue;
$link = $item->item($i)->getElementsByTagName('link')->item(0)->childNodes->item(0)->nodeValue;
echo '<h2>' . $title . '</h2>';
}
In this example I access the latest 4 blog entries.
Hope this helps.

Resources