How to handle subdomains in sitemap index - sitemap

I have site.com, and there is also subdomain.site.com.
I've created two separate sitemap files, one for site.com and one for subdomain.site.com; let's call them sitemap_main.xml and sitemap_subdomain.xml.
Now I need to combine those files in a single XML file known as a "sitemap index".
The question is how to specify the link to the subdomain's sitemap correctly.
Should it be:
site.com/sitemap_main.xml
site.com/sitemap_subdomain.xml
or should it be
site.com/sitemap_main.xml
subdomain.site.com/sitemap_subdomain.xml

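It should be:
site.com/sitemap_main.xml
site.com/sitemap_subdomain.xml
The sitemaps protocol only lets a sitemap index reference sitemaps hosted on the same site as the index file itself. Because sitemap_subdomain.xml then lists URLs from a different host, search engines may also want proof that you control subdomain.site.com, for example a Sitemap: line in that host's robots.txt pointing at the file.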
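As an illustration of what the resulting index could look like, here is a minimal Python sketch that builds one with the standard library. The https:// scheme and the function name build_sitemap_index are assumptions made for the example; the question does not say which scheme the site uses.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index document (as a string) listing the given sitemap URLs."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = url
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Both entries live on the same host as the index file (scheme is assumed).
index_xml = build_sitemap_index([
    "https://site.com/sitemap_main.xml",
    "https://site.com/sitemap_subdomain.xml",
])
print(index_xml)
```

The printed document would be saved on site.com (for example as sitemap_index.xml, a hypothetical name) and submitted to the search engine in place of the individual sitemaps.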


Is there any way I could count the visitors for a specific file in the public folder?

Currently I'm using Laravel, and as you know there are many ways to count the visitors for a single page (hits on routes). But I also want to count the visitors for an image or a PDF file in the public folder. Any ideas?
You will need to send requests for the file through your application by using a proxy URL, instead of linking directly to the file location. The application then serves the file itself.
For example: if you have a PDF file accessible at yourdomain.com/public/some-file.pdf, create a proxy URL for it, something like yourdomain.com/proxy/some-file.pdf.
In your routes file, create the corresponding route. Note that the last segment is dynamic and should identify your file.
Route::get('/proxy/{fileName}', [DefaultController::class, 'proxy']);
In your DefaultController.php file do the necessary logging of counts & then serve your original file as a response.
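// Requires: use Illuminate\Support\Facades\Storage;
public function proxy($fileName) {
    // Your logic to increase the visit count goes here (a database insert, for example).
    // Storage::path() resolves against the default storage disk;
    // use public_path($fileName) instead if the file lives in the public folder.
    $pathToFile = Storage::path($fileName);
    return response()->file($pathToFile);
}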
One possible way is to have routes that return the PDF or image as a stream. That way you can use the same counting method that you use for pages, for example keyed on $request->ip(). The downside is that you have to rewrite all image and PDF links on your pages to point to the route instead of the asset path or whatever you use now.

Is it possible to change page URLs in CodeIgniter

Is it possible to change the collection page URLs from https://www.product.com/collections/view/14e92dd34fb680e94c02e9ebd2ce36b29e92fd8a-*4*75 to something like http://www.product.com/larimar-jewelry/abril-necklace in CodeIgniter? We do not want any random text in the URLs.
You can set your alias in application/config/routes.php
$route['alias-value'] = 'controller/method';
Refer to the URI Routing section of the CodeIgniter documentation.
Otherwise, you can do this with a rewrite rule in an .htaccess file.
Refer to the Apache mod_rewrite documentation.

How to crawl/index the links on a single page: Google Search Appliance

I am new to the GSA and don't have full admin access to the system, so I have to forward requests to ICT Services to have changes made to our crawls and collections.
I hope someone can help with this question:
I have a single web page which has a list of links to about 180 documents. Most of them are stored in the same subdirectory, /docs/, which contains some 2400 documents. The rest are scattered across the site in a number of other subdirectories, i.e. /finance/, /hr/ etc.
At the moment I either get the single web page indexed and none of the 180 links, or I get the one page plus ALL of the 2400 documents in the /docs/ subdirectory.
I want to be able to crawl/index just this page and the 180 links, and create a separate collection.
Is there a simple way to do this?
Regards
Henry
Another possible solution is to use a robots.txt file to disallow crawling of the other pages you don't want. This would be a lot of work if you have to enumerate all of them though.
Your best bet is to see if there is some common URL pattern you can use to specify only the 180 pages you do want. For example, are the pages you do want all PDFs, and the other files you do not want are all some other type? If you can find something that is common for all the pages you want that isn't true for the other pages, you can use that to formulate a pattern (maybe using regex) to do what you want.
Instead of configuring a URL pattern under the start URLs and follow patterns, configure the complete URLs. Take the 180 document URLs plus the single web page URL and put all 181 URLs under both the start URLs and the follow patterns. With complete URLs and no common URL pattern under the follow patterns, the GSA will not crawl the other URLs in the application.
Create a new collection and place all 180 document URLs plus the single web page URL (or a generic pattern matching those 181 URLs) in that collection under "Include Content Matching the Following Patterns".
I assume that you do not want to index the other 2400 documents on the GSA.
Hope it helps.
Regards,
Mohan.
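If you go the complete-URL route, collecting the 181 URLs by hand is tedious. The following is a minimal Python sketch that pulls every link out of the list page and turns it into an absolute URL, ready to paste into the crawl configuration. The host name intranet.example.com, the sample HTML, and the names LinkCollector and start_urls are all made up for illustration; in practice you would feed it the saved HTML of the real list page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag as an absolute URL, without duplicates."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            absolute = urljoin(self.base_url, href)
            if absolute not in self.links:
                self.links.append(absolute)


def start_urls(page_url, page_html):
    """Return the list page URL followed by every document URL it links to."""
    collector = LinkCollector(page_url)
    collector.feed(page_html)
    return [page_url] + collector.links


# Hypothetical stand-in for the real list page.
sample_html = """
<ul>
  <li><a href="/docs/policy-a.pdf">Policy A</a></li>
  <li><a href="/finance/budget.pdf">Budget</a></li>
  <li><a href="/docs/policy-a.pdf">Policy A (again)</a></li>
</ul>
"""

urls = start_urls("http://intranet.example.com/list.html", sample_html)
for url in urls:
    print(url)
```

The same list can be reused for the collection's include patterns, so the crawl configuration and the collection stay in sync.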
You would be better off using a metadata-and-URL feed for this.
It lets you control whether the GSA follows links in your 180 pages (if you feed those in) or whether it indexes your list page (if you feed just that). You do this by specifying noindex or nofollow.
You'll still need to have your follow and crawl patterns and collections set up correctly, but it's the easiest way to control what gets indexed.
You don't necessarily need to write code for this either; you can use curl and hand-craft the XML.
The documentation is pretty good and easy to follow: see the Feeds Protocol Developer's Guide.

Can Sling mappings be restricted to requests with host header

I would like to selectively apply Sling mappings defined in sling:Mapping nodes under /etc/map.publish and can't get the behaviour I would like.
Essentially, I would like the mapping rule to trigger only when the host header of the request matches the mapping.
I am currently using sling:Mapping nodes under /etc/map.publish to map resource paths to short URLs in the response.
So under /etc/map.publish/http/myapp I would have the following node:
<jcr:root ...
    jcr:primaryType="sling:Mapping"
    sling:internalRedirect="/content/company/app/en"
    sling:match="app.company.com">
</jcr:root>
What I would like is that when a user requests:
http://app.company.com/content/company/app/en/page.html
The urls in the response (when mapped) will return in the form:
http://app.company.com/page.html
The reason for this difference in inbound and outbound urls is because I have Apache rewriting URLs for different device types.
However, when a request with a different host header arrives, such as:
http://localhost:4502/content/company/app/en/page.html
I do not want the URLs to be mapped according to that rule. Right now, it is being mapped to
http://app.company.com/page.html
It seems as though resolving strictly takes the host and port into account, whereas mapping URLs during output just finds and uses a "best match". I would like map() to behave like resolve() if possible.
There are two mechanisms based on /etc/map:
The URL resolver, which uses resolver.resolve() and is responsible for transforming URLs like http://app.company.com/page.html into a content path, e.g. /content/company/app/en/page.html.
The link rewriter, which uses resolver.map() to rewrite links in the content (in <a>, <img>, etc.) from the /content/company/app/en/page.html form to the shortened full URL. It works only if you don't have any regular expressions in the appropriate sling:match property.
You can use the domain name to map/resolve content and, for example, create a multi-domain environment, so that http://app.company.com/page.html hits one resource and http://app.company2.com/page.html hits another.
However, you can't disable or enable the link rewriter depending on the current request host. For example, if you configure mappings as above, the /content/company/app/en/page.html content path will always be shortened to http://app.company.com/page.html, no matter what host header you have in your request.
If you want to make sure your inbound request is resolved, just add a second mapping to it.
Your mapping would look like this:
<jcr:root ...
    jcr:primaryType="sling:Mapping"
    sling:internalRedirect="[/content/company/app/en,/content,/]"
    sling:match="app.company.com">
</jcr:root>
Outbound mappings, such as resolver.map(), will use the first rule that applies.

Replacing the "X" in this url www.websitename.com/info.php?lid=X

Help please.
I am looking for the best way to replace the "X" (number) in this url
www.websitename.com/info.php?lid=x
The "X" is a numerical value. I would like to replace it with the "name" field from my database.
Is mod_rewrite the way to go? I have multiple URLs of the same format (with a different "X" value at the end, of course) that I want to turn into friendlier URLs by replacing the "X" with the corresponding value from the database field "name".
If mod_rewrite is the way to go, can anyone help out with recommended code to go in the .htaccess file?
Thanks in advance.
Totally edited: My previous answer was based on a misunderstanding of what you're trying to ask.
What you are asking is to create a friendly URL system. This is covered in many tutorials -- just search for "friendly URLs" and you'll find lots of resources.
Here's a summary of how it works...
To create friendly URLs for your site, you would need something like this in .htaccess (not sure if I got the RewriteRule right because this is completely off the top of my head, so google for a full-blown tutorial to verify):
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteRule ^info/(.+)$ /info.php?name=$1 [L]
</IfModule>
This means a request to http://www.example.com/info/foo would be rewritten to http://www.example.com/info.php?name=foo.
Then you need to modify your application (in particular, the info.php file) to handle this new request format in which the name is given in the URL instead of the id.
Note that in this example, all names (e.g., "foo") must be unique. If any two items in your database have the same name, you're going to have problems. With this in mind, you might want to add a new field to your database table, which is a unique column containing a string using only alphanumeric characters and hyphens appropriate for use in a URL (this type of string is called a slug). You will basically use this slug instead of the id for database queries. Let's say you create an item named "The Discombobulator". When this item is created in your application, it should also create a slug along the lines of "the-discombobulator" and ensure it's unique. If you create a second item also called "The Discombobulator", your app might generate a slug for it like "the-discombobulator-2".
So, when someone requests http://www.example.com/info/the-discombobulator-2, mod_rewrite changes that to http://www.example.com/info.php?name=the-discombobulator-2 and hands it to your app. Your app gets the name parameter, which is "the-discombobulator-2" and looks that up in the database's slug field, and gets the matching record.
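To make the slug idea concrete, here is a minimal sketch in Python (the same logic ports directly to PHP). The function names slugify and unique_slug are hypothetical, and the set of existing slugs stands in for a lookup against the unique slug column in your database.

```python
import re


def slugify(name):
    """Lowercase the name and collapse every run of non-alphanumerics into one hyphen."""
    slug = name.lower()
    slug = re.sub(r"[^a-z0-9]+", "-", slug)
    return slug.strip("-")


def unique_slug(name, existing_slugs):
    """Return a slug for name that is not already in existing_slugs.

    existing_slugs is a stand-in for a database lookup on the slug column.
    """
    base = slugify(name)
    candidate = base
    suffix = 2
    while candidate in existing_slugs:
        candidate = f"{base}-{suffix}"
        suffix += 1
    return candidate


taken = set()
first = unique_slug("The Discombobulator", taken)
taken.add(first)
second = unique_slug("The Discombobulator", taken)
print(first)   # the-discombobulator
print(second)  # the-discombobulator-2
```

In a real application the uniqueness check and the insert should happen in one transaction, or rely on a unique index on the slug column, so that two simultaneous requests cannot claim the same slug.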
I think this is what you are looking for:
http://www.roscripts.com/Pretty_URLs_-_a_guide_to_URL_rewriting-168.html
