MediaWiki - How to use all images from a Category in a gallery - media

I have set a description on the file:
{{Information
|description = A cheeky description
}}
I have tried to use this CategoryGallery successfully, but I cannot get the descriptions to work:
I have also installed the required extra extension. Its documentation talks about short_summary, but as far as I can see this does not exist in the Information template:
<catgallery cat="Aubry" bpdcaption="short_summary" />
So how do I use category images in a gallery with MediaWiki?

Cargo may be overkill for this (you didn't mention that you are saving any metadata for the images).
I personally use DPL, which allows you to do some cool tricks with categories. You can check the manual for details, but for your case:
{{#dpl:
category=all_photos
|mode=gallery
}}
That is a very simple example, but you can control the output format within the query (see the manual mentioned above).
DPL is built for these scenarios.

If you don't mind using a different extension, Cargo can do this pretty easily (and lots of other useful stuff as well).
In Template:Artwork do something like:
<noinclude>
{{#cargo_declare: _table = artworks
| description = Wikitext
| artist = Page
}}
</noinclude><includeonly>
{{#cargo_store: _table = artworks
| description = {{{description|}}}
| artist = {{{artist|}}}
}}
</includeonly>
; Description
: {{{description}}}
; Artist
: [[{{{artist}}}]]
And then where you want the gallery (e.g. on a page for an artist), do something like:
{{#cargo_query: tables = artworks
|fields = _pageName, description, artist
|where = artist = '{{PAGENAME}}'
|format = gallery
|caption field = description
|show filename = 0
|show dimensions = 0
|show bytes = 0
}}
This assumes that the Artwork template is used on files' pages; if you wanted a mainspace page for each artwork, you could still do something similar but would have to introduce a separate image field that points to the actual file.

With a little prep, you should be able to use '_categories', provided the wiki's Cargo is set up to store categories with "$wgCargoPageDataColumns[] = 'categories';" in LocalSettings.php.
For example:
{{#cargo_query:
tables=MyTable
|where=MyTable._categories HOLDS 'Foo'
|fields=MyTable._pageName
}}
The above should give the names of the files in the category 'Foo'.
To show the images, change the fields to:
|fields=CONCAT( '[[file:', MyTable._pageName, '|thumb]]' )

Related

Bibliography style with Quarto documents

The default way of displaying references with Quarto documents seems to put author names in this format: Last1, First1, First2 Last2, First3 Last3, and First4 Last4. So the first author name is displayed differently than the rest. Is that intentional and is there a way to change that?
Here's an example:
---
project:
  type: website
format: html
bibliography: references.bib
---
## Text
@bibitem1
Content of references.bib
@article{bibitem1,
author = {First1 Last1 and First2 Last2 and First3 Last3 and First4 Last4},
title = {Article title},
journal = {Journal name},
year = {2013},
volume = {3},
number = {72},
pages = {14--18}
}
which is displayed as
Last1, First1, First2 Last2, First3 Last3, and First4 Last4. 2013. “Article Title.” Journal Name 3 (72): 14–18.
How your references are displayed is determined entirely by the citation style you use, and Quarto applies a default one. You can specify a custom style with the csl option in your YAML header, for example:
csl: biomed-central.csl
Styles for many journals are available with the Zotero project: https://www.zotero.org/styles
These styles are composed using the Citation Style Language. In principle, you could customize such a style (including the default style used by Quarto). It is not that trivial, though, and would require some effort to understand and use the language.
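As a rough illustration of what such a customization can look like: in CSL, the name-as-sort-order attribute on a <name> element controls name inversion ("first" inverts only the first author, "all" inverts every author, and leaving it out keeps every name in given-family order). The Python sketch below rewrites that attribute in a copy of a style. The function name set_name_order is made up for this example, and it assumes the style you start from actually sets the attribute, which seems to be the case for the Chicago author-date style but should be checked in your own file.

```python
import xml.etree.ElementTree as ET

CSL_NS = "http://purl.org/csl/ns"  # XML namespace used by CSL style files


def set_name_order(csl_xml, order=None):
    """Return the style with name-as-sort-order changed on every <name>.

    order=None removes the attribute (all names as "First Last");
    order="all" inverts every name ("Last, First").
    Hypothetical helper for illustration, not part of Quarto or pandoc.
    """
    ET.register_namespace("", CSL_NS)  # keep CSL as the default namespace on output
    root = ET.fromstring(csl_xml)
    for name in root.iter("{%s}name" % CSL_NS):
        if "name-as-sort-order" in name.attrib:
            if order is None:
                del name.attrib["name-as-sort-order"]
            else:
                name.set("name-as-sort-order", order)
    return ET.tostring(root, encoding="unicode")
```

The result can be saved under a new file name and referenced with the csl option. Note that ElementTree drops XML comments, so the license header of the original style would have to be copied over by hand.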

Count and print images from URL

This is my first time using Spark/Scala and I am lost.
I am supposed to write a program that takes in a URL and outputs the number of images and the name of each image file.
So far I have been able to get the image count. I am doing all of this in the command prompt, which makes it quite difficult to go back and edit my def without retyping the whole thing. Is there a better alternative? It took me quite a while just to get Spark/Scala working (I would have liked to use PySpark but was unable to get them to communicate).
scala> def URLcount(url : String) : String = {
| var html = scala.io.Source.fromURL(url).mkString
| var list = html.split("\n").filter(_ != "")
| val rdds = sc.parallelize(list)
| val count = rdds.filter(_.contains("img")).count()
| return("There are " + count + " images at the " + url + " site.")
| }
URLcount: (url: String)String
scala> URLcount("https://www.yahoo.com/")
res14: String = There are 9 images at the https://www.yahoo.com/ site.
So I'm assuming after I parallelize the list I should be able to apply a filter and create a list of all the strings that contain "img src"
How would I create such a list and then print it line by line to display the image URLs?
I'm not sure Spark is a great solution for parsing HTML. Spark is general purpose, but I think it was created for big data. I did not find any easy way to parse HTML with Spark (although I easily found ways for both XML and JSON). Filtering lines also means that you will often print a very long string, because HTML pages are often minified. Anyway, for this page your program will print lines like this:
<p>So I'm assuming after I parallelize the list I should be about to apply a filter and create a list of all the strings that contain "img src"
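I would advise you to use Jsoup:
import org.jsoup.Jsoup
// Fetch and parse the page
val yahoo = Jsoup.connect("https://www.yahoo.com").get()
// Select every <img> element that has a src attribute
val images = yahoo.select("img[src]")
// Print each matching element, one per line
images.forEach(img => println(img))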
You can use Spark for other purposes.
PS: I found 39 image tags with a src attribute on https://www.yahoo.com. It is very easy to get wrong results if you don't use a good HTML parser.
Another way: prepare your data first and then use Spark.
Sorry for my English.
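To illustrate why a real HTML parser gives a different count than filtering lines for "img", here is a minimal sketch using only Python's standard library (Python rather than Scala, purely for illustration). The names ImgCollector and image_sources are invented for this example. It collects the src of every <img> tag, no matter how the tags are spread across lines, and ignores the word "img" in ordinary text.

```python
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Collects the src attribute of every <img> tag it encounters."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def image_sources(html):
    """Return the list of image URLs found in an HTML string."""
    collector = ImgCollector()
    collector.feed(html)
    collector.close()
    return collector.sources
```

The count is then len(image_sources(html)), and the URLs can be printed one per line with a simple loop.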

citation style language - extend with an additional field

I produce a bibliography with pandoc from a BibTeX file. In my BibTeX entries I have the location of the PDF (not a URL, just a file reference in a field named file). I would like to include this reference in the bibliography, but I do not see how to extend chicago-author-date.csl. I am completely new to CSL.
I assume I have to add something like
<text macro="file" prefix=". "/>
in the layout section. But how to define the macro? How is the connection between the bibtex field and the CSL achieved?
Is there somewhere a "how to" page?
Thank you for help!
An example bibtex entry is:
author = {Frank, Andrew U.},
title = {Geo-Ontologies Are Scale Dependent (abstract only)},
booktitle = {European Geosciences Union, General Assembly 2009, Session Knowledge and Ontologies},
year = {2009},
editor = {Pulkkinen, Tuija},
url = {http://publik.tuwien.ac.at/files/PubDat-175453.pdf},
file = {docs/docs4/4698_GeoOntologies_abstarct_EUG_09.pdf},
keywords = {Onto},
owner = {frank},
timestamp = {2018.11.29},
}
The file entry should be inserted in the output as a relative, clickable web reference, in addition to the usual output from the chicago-author-date style.
I add a list of nocite to the markdown text (read in from file) and process it (in Haskell) with the API
res <- processCites' markdownText
It works OK; only the file value is missing.

Read image IPTC data

I'm having some trouble reading the IPTC data from some images. I want to do this because my client already has all the keywords in the IPTC data and doesn't want to re-enter them on the site.
So I created this simple script to read them out:
$size = getimagesize($image, $info);
if(isset($info['APP13'])) {
$iptc = iptcparse($info['APP13']);
print '<pre>';
var_dump($iptc['2#025']);
print '</pre>';
}
This works perfectly in most cases, but it's having trouble with some images.
Notice: Undefined index: 2#025
Yet I can clearly see the keywords in Photoshop.
Are there any decent small libraries that could read the keywords in every image? Or am I doing something wrong here?
I've seen a lot of weird IPTC problems. It could be that you have two APP13 segments. I have noticed that, for some reason, some JPEGs have multiple IPTC blocks, possibly as a result of using several photo-editing programs or of manual file manipulation.
It could be that PHP is trying to read an empty APP13 segment or even embedded "thumbnail metadata".
It could also be a problem with segment lengths: APP13 and 8BIM have length marker bytes that might hold wrong values.
Try a hex editor and check the file "manually".
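If a hex editor is inconvenient, the same check can be scripted. The sketch below (Python, standard library only; the function names list_segments and app13_segments are invented for this example) walks the marker segments at the start of a JPEG and reports every APP13 segment (marker 0xFFED) with its offset and declared length. It stops at the start-of-scan marker and does not look inside the 8BIM blocks, so it only answers the question of whether there are several APP13 segments and how long each claims to be.

```python
import struct


def list_segments(data):
    """Return (marker, offset, declared_length) for each header segment."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    segments = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker, skip it
            pos += 1
            continue
        if marker == 0xDA:  # start of scan: compressed image data follows
            break
        # The 2-byte big-endian length counts itself but not the marker
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segments.append((marker, pos, length))
        pos += 2 + length
    return segments


def app13_segments(data):
    """Return only the APP13 (0xED) segments, where IPTC usually lives."""
    return [seg for seg in list_segments(data) if seg[0] == 0xED]
```

Called on the bytes of a problem file (read in binary mode), more than one entry in the result, or an entry with a suspiciously small length, would point to the situations described above.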
I have found that IPTC is almost always embedded as XML using the XMP format, and is often not in the APP13 slot. You can sometimes get the IPTC info by using iptcparse($info['APP1']), but the most reliable way to get it without a third-party library is to simply search through the image file for the relevant XML string (I got this from another answer, but I haven't been able to find it again, otherwise I would link to it):
The xml for the keywords always has the form "<dc:subject>...<rdf:Seq><rdf:li>Keyword 1</rdf:li><rdf:li>Keyword 2</rdf:li>...<rdf:li>Keyword N</rdf:li></rdf:Seq>...</dc:subject>"
So you can just get the file as a string using file_get_contents(get_attached_file($attachment_id)), use strpos() to find each opening (<rdf:li>) and closing (</rdf:li>) XML tag, and grab the keyword between them using substr().
The following snippet works for all JPEGs I have tested it on. It will fill the array $keys with IPTC tags taken from an image on WordPress with ID $attachment_id:
$content = file_get_contents(get_attached_file($attachment_id));
// Initialize empty array to store keywords
$keys = array();
// Look for XMP data: XML tag "dc:subject" is where keywords are stored
$xmp_tag_pos = strpos($content, '<dc:subject>');
// Only proceed if able to find dc:subject tag (strpos() returns false on failure,
// so test the result before adding the tag length to it)
if ($xmp_tag_pos !== false) {
    $xmp_data_start = $xmp_tag_pos + 12;
    $xmp_data_end = strpos($content, '</dc:subject>');
    $xmp_data_length = $xmp_data_end - $xmp_data_start;
    $xmp_data = substr($content, $xmp_data_start, $xmp_data_length);
    // Look for tag "rdf:Seq" where individual keywords are listed
    $seq_tag_pos = strpos($xmp_data, '<rdf:Seq>');
    // Only proceed if able to find rdf:Seq tag
    if ($seq_tag_pos !== false) {
        $key_data_start = $seq_tag_pos + 9;
        $key_data_end = strpos($xmp_data, '</rdf:Seq>');
        $key_data_length = $key_data_end - $key_data_start;
        $key_data = substr($xmp_data, $key_data_start, $key_data_length);
        // $ctr will track position of each <rdf:li> tag, starting with first
        $ctr = strpos($key_data, '<rdf:li>');
        // While loop stores each keyword and searches for next XML keyword tag
        // (strict comparison, because a tag at position 0 is a valid match)
        while ($ctr !== false && $ctr < $key_data_length) {
            // Skip past the tag to get the keyword itself
            $key_begin = $ctr + 8;
            // Keyword ends where closing tag begins
            $key_end = strpos($key_data, '</rdf:li>', $key_begin);
            // Make sure keyword has a closing tag
            if ($key_end === false) break;
            // Make sure keyword is not too long (not sure what WP can handle)
            $key_length = min(100, $key_end - $key_begin);
            // Add keyword to keyword array
            $keys[] = substr($key_data, $key_begin, $key_length);
            // Find next keyword open tag
            $ctr = strpos($key_data, '<rdf:li>', $key_end);
        }
    }
}
I have this implemented in a plugin to put IPTC keywords into WP's "Description" field, which you can find here.
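For comparison, the same string-search idea can be written more compactly with regular expressions. This is only a sketch in Python, not the plugin's code, and the function name xmp_keywords is made up for the example. It takes the raw bytes of the image file, finds the first dc:subject block, and returns the text of every rdf:li inside it, whatever container element surrounds the items.

```python
import re


def xmp_keywords(image_bytes):
    """Return the keywords listed in the first <dc:subject> block, if any."""
    subject = re.search(rb"<dc:subject>(.*?)</dc:subject>", image_bytes, re.DOTALL)
    if subject is None:
        return []
    # Each keyword sits in its own <rdf:li> element
    items = re.findall(rb"<rdf:li[^>]*>(.*?)</rdf:li>", subject.group(1), re.DOTALL)
    return [item.decode("utf-8", errors="replace").strip() for item in items]
```

Like the PHP version, this treats the file as a plain string and does not parse the XMP packet as XML, so it shares the same limits: XML entities are not decoded.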
ExifTool is very robust if you can shell out to that (from PHP it looks like?)

Pulling Images from rss/atom feeds using magpie rss

I'm using PHP and Magpie and would like a general way of detecting images in a feed item. I know some websites place images within the enclosure tag, others like this images[rss], and some simply add them to the description. Does anyone have a general function for detecting whether an RSS item has an image and extracting the image URL after it has been parsed by Magpie?
I think regular expressions would be needed to extract it from the description, but I'm a noob at those. Please help if you can.
I spent ages searching for a way of displaying images in RSS via Magpie myself, and in the end I had to examine the code to figure out how to get it to work.
Like you say, the reason Magpie doesn't pick up images in the element is because they are specified using the 'enclosure' tag, which is an empty tag where the information is in the attributes, e.g.
<enclosure url="http://www.mysite.com/myphoto.jpg" length="14478" type="image/jpeg" />
As a hack to get it to work quickly for me I added the following lines of code into rss_parse.inc:
function feed_start_element($p, $element, &$attrs) {
...
if ( $el == 'channel' )
{
$this->inchannel = true;
}
...
// START EDIT - add this elseif condition to the if ($el == xxx) statement.
// Checks if element is enclosure tag, and if so stores the attribute values
elseif ($el == 'enclosure' ) {
if ( isset($attrs['url']) ) {
$this->current_item['enclosure_url'] = $attrs['url'];
// type and length may be missing in malformed feeds
$this->current_item['enclosure_type'] = isset($attrs['type']) ? $attrs['type'] : '';
$this->current_item['enclosure_length'] = isset($attrs['length']) ? $attrs['length'] : '';
}
}
// END EDIT
...
}
The url to the image is in $myRSSitem['enclosure_url'] and the size is in $myRSSitem['enclosure_length'].
Note that enclosure tags can refer to many types of media, so first check if the type is actually an image by checking $myRSSitem['enclosure_type'].
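The general detection logic the question asks for could look roughly like the sketch below. It is written in Python for illustration and assumes an item represented as a dictionary with the enclosure_url and enclosure_type keys from the hack above plus a description key. The function name find_item_image and the regular expression are assumptions of this example, not part of Magpie. It prefers an image enclosure and falls back to the first img tag in the description.

```python
import re

# Matches the src value of the first <img> tag in a chunk of HTML
IMG_SRC = re.compile(r"""<img[^>]+src\s*=\s*["']([^"']+)["']""", re.IGNORECASE)


def find_item_image(item):
    """Return an image URL for a parsed feed item, or None if there is none."""
    url = item.get("enclosure_url")
    mime = item.get("enclosure_type") or ""
    # Enclosures can hold audio or video too, so check the MIME type first
    if url and mime.startswith("image/"):
        return url
    match = IMG_SRC.search(item.get("description") or "")
    return match.group(1) if match else None
```

The regular expression is deliberately simple and will miss unquoted src values, so it is a starting point rather than a complete HTML parser.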
Maybe someone else has a better suggestion, and I'm sure this could be done more elegantly to pick up attributes from other empty tags, but I needed a very quick fix (deadline pressures). I hope this helps someone else in difficulty!
