Get TIFF tag value (including non-ASCII characters) from TIFF images in Java 11

I am trying to read various tag values, such as 259 (Compression), 33432 (Copyright), 306 (DateTime) and 315 (Artist), from a TIFF image in Java. Can anyone suggest the best way to get those values in Java 11?
I tried tiffinfo commands (like "tiffinfo -c myfile.tif"), but I did not find any tiffinfo (libtiff) command or any Java library that gives me the value of one specific tag (e.g. DateTime) of a TIFF image.
Update:
As haraldK suggested, I tried ImageIO as follows:
try (ImageInputStream input = ImageIO.createImageInputStream(tiffFile)) {
    ImageReader reader = ImageIO.getImageReaders(input).next(); // TODO: Handle reader not found
    reader.setInput(input);
    IIOMetadata metadata = reader.getImageMetadata(0);
    TIFFDirectory ifd = TIFFDirectory.createFromMetadata(metadata);
    TIFFField dateTime = ifd.getTIFFField(306);
    String dateString = dateTime.getAsString(0);
}
But it does not give the exact value of the tag. When the value contains non-ASCII characters (ö, ü, ä etc.), question marks replace the real characters.
Can anyone tell me how to get the exact value (including non-ASCII characters) of the tag from a TIFFField?

You can use standard ImageIO to read the TIFF image metadata and get the requested values from it. Starting with Java 9, the JDK includes some extra support classes that make this straightforward:
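import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.metadata.IIOMetadata;
import javax.imageio.plugins.tiff.TIFFDirectory;
import javax.imageio.plugins.tiff.TIFFField;
import javax.imageio.stream.ImageInputStream;

try (ImageInputStream input = ImageIO.createImageInputStream(tiffFile)) {
    ImageReader reader = ImageIO.getImageReaders(input).next(); // TODO: Handle reader not found
    reader.setInput(input);
    IIOMetadata metadata = reader.getImageMetadata(0); // 0 is the index of first image
    TIFFDirectory ifd = TIFFDirectory.createFromMetadata(metadata);
    TIFFField dateTime = ifd.getTIFFField(306); // Yes, that's 3 F's... (null if the tag is absent)
    String dateString = dateTime != null ? dateTime.getAsString(0) : null; // TIFF dates are strings...
}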
tiffFile must be a valid (existing, readable) java.io.File, java.io.RandomAccessFile or java.io.InputStream (or another supported input type; this is plugin-based). If not, input will be null, and the code will fail.
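The "TODO: Handle reader not found" comment and the null input case mentioned above could be handled along these lines. This is only a minimal sketch: the class name ReaderLookup and the method findReader are hypothetical helpers, not part of ImageIO.

```java
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;
import java.util.Iterator;
import java.util.Optional;

class ReaderLookup {
    /**
     * Returns a reader with its input already set, or an empty Optional
     * when the stream is null or no installed plugin recognizes the data.
     * (Hypothetical helper, not an ImageIO API.)
     */
    static Optional<ImageReader> findReader(ImageInputStream input) {
        if (input == null) {
            return Optional.empty(); // createImageInputStream found no usable stream
        }
        Iterator<ImageReader> readers = ImageIO.getImageReaders(input);
        if (!readers.hasNext()) {
            return Optional.empty(); // no plugin claims this format
        }
        ImageReader reader = readers.next();
        reader.setInput(input);
        return Optional.of(reader);
    }
}
```

The caller can then decide how to report a missing reader instead of running into a NoSuchElementException from next().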
You can use a similar but a lot more verbose version that will work in older versions of Java, as long as you have a TIFF plugin:
try (ImageInputStream input = ImageIO.createImageInputStream(tiffFile)) {
    ImageReader reader = ImageIO.getImageReaders(input).next(); // TODO: Handle reader not found
    reader.setInput(input);
    IIOMetadata metadata = reader.getImageMetadata(0); // 0 is the index of first image
    // Get "native" TIFF metadata for first IFD
    // (getAsTree returns an org.w3c.dom.Node)
    Node root = metadata.getAsTree("com_sun_media_imageio_plugins_tiff_image_1.0");
    // Cast to Element, since Node itself has no getElementsByTagName
    Element ifd = (Element) root.getFirstChild();
    NodeList fields = ifd.getElementsByTagName("TIFFField"); // Yes, that's 3 F's...
    for (int i = 0; i < fields.getLength(); i++) {
        Element field = (Element) fields.item(i);
        if ("306".equals(field.getAttribute("number"))) {
            // This is your DateTime (306) tag,
            // now do something with it 😀
            // ...
        }
    }
}
Hardly elegant code, though... The Java 9+ approach is much cleaner and easier to reason about.
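Neither snippet addresses the question marks from the update. The TIFF 6.0 specification defines ASCII fields as 7-bit ASCII, so a reader may decode them that way. The sketch below only illustrates the symptom. It assumes (the thread does not confirm this) that the file stores the text as ISO-8859-1 bytes, and the class name TagDecodingDemo is hypothetical. It does not show how to obtain the raw bytes from the TIFF plugin.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class TagDecodingDemo {
    /** Decodes raw tag bytes with the given charset. */
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // "Jörg" as it might be stored by a writer using ISO-8859-1 (assumption)
        byte[] raw = "J\u00f6rg".getBytes(StandardCharsets.ISO_8859_1);

        // Decoding as 7-bit ASCII replaces the byte 0xF6 with U+FFFD,
        // which many consoles then display as a question mark.
        String asAscii = decode(raw, StandardCharsets.US_ASCII);

        // Decoding with the charset the writer used keeps the umlaut.
        String asLatin1 = decode(raw, StandardCharsets.ISO_8859_1);

        System.out.println(asAscii.indexOf('\uFFFD') >= 0);
        System.out.println(asLatin1.equals("J\u00f6rg"));
    }
}
```

If this is what happens, the original characters are already lost by the time the string is returned, so they would have to be recovered from the raw field bytes instead.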

Related

PDF does not use utf-8 string encoding like Go

I am working with the UniPDF library (https://github.com/unidoc/unipdf) for Go to process PDF files. I use the 'SetReason' method to set the signing reason of my PDF file.
func (_aggg *PdfSignature )SetReason (reason string ){_aggg .Reason =_gb .MakeString (reason )};
This turns the Cyrillic text into unreadable symbols (as shown in the picture).
[Image: garbled Cyrillic symbols]
The original text is: "русский > Request Id = 12, Task Id = 145"
Cyrillic symbols in the main content of the PDF file are all fine. The problem is only in the 'Signs' ('Подписи') part (as shown in the picture).
The library has a note about this (see 'NOTE'):
// MakeString creates an PdfObjectString from a string.
// NOTE: **PDF does not use utf-8 string encoding like Go so `s` will often not be a utf-8 encoded
// string.**
func MakeString(s string) *PdfObjectString { _aaad := PdfObjectString{_gcae: s}; return &_aaad }
I want the 'reason' of my PDF file to show readable Cyrillic symbols.
Is there any solution for this? I hope I explained the problem clearly.

It should work if you use core.MakeEncodedString
https://apidocs.unidoc.io/unipdf/latest/github.com/unidoc/unipdf/v3/core/#MakeEncodedString
signature.Reason = core.MakeEncodedString("русский > Request Id = 12, Task Id = 145", true)
func MakeEncodedString(s string, utf16BE bool) *PdfObjectString
MakeEncodedString creates a PdfObjectString with encoded content, which can be either UTF-16BE or PDFDocEncoding depending on whether utf16BE is true or false respectively.
This will store the reason in UTF-16BE which is appropriate for this text.
Disclosure: I am the original developer of UniPDF.
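For readers wondering what "stored in UTF-16BE" means at the byte level: a PDF text string in this encoding starts with the byte order mark FE FF, followed by the big-endian UTF-16 code units. The following is a rough Java sketch of that byte layout only. It is not UniPDF's implementation, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class PdfTextStringSketch {
    /** Encodes text as UTF-16BE preceded by the FE FF byte order mark. */
    static byte[] toUtf16BeWithBom(String text) {
        byte[] body = text.getBytes(StandardCharsets.UTF_16BE); // no BOM is added here
        byte[] out = new byte[body.length + 2];
        out[0] = (byte) 0xFE;
        out[1] = (byte) 0xFF;
        System.arraycopy(body, 0, out, 2, body.length);
        return out;
    }

    public static void main(String[] args) {
        // Print the bytes for the Cyrillic letter "р" (U+0440) in hex
        for (byte b : toUtf16BeWithBom("\u0440")) {
            System.out.printf("%02X ", b & 0xFF);
        }
        System.out.println();
    }
}
```

Writing the UTF-8 bytes of a Go string directly, without such an encoding step, is what leads a viewer to interpret them as single-byte characters.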

How to save an image in a subdirectory on android Q whilst remaining backwards compatible

I'm creating a simple image editor app and therefore need to load and save image files. I'd like the saved files to appear in the gallery in a separate album. From Android API 28 to 29, there have been drastic changes to what extent an app is able to access storage. I'm able to do what I want in Android Q (API 29) but that way is not backwards compatible.
When I try to achieve the same result on lower API versions, I have so far only found ways that require deprecated code (as of API 29).
These include:
the use of the MediaStore.Images.Media.DATA column
getting the file path to the external storage via Environment.getExternalStoragePublicDirectory(...)
inserting the image directly via MediaStore.Images.Media.insertImage(...)
My question is: is it possible to implement this in a way that is backwards compatible but doesn't require deprecated code? If not, is it okay to use deprecated code in this situation, or will these methods soon be deleted from the SDK? In any case it feels very bad to use deprecated methods, so I'd rather not :)
This is the way I found which works with API 29:
ContentValues values = new ContentValues();
String filename = System.currentTimeMillis() + ".jpg";
values.put(MediaStore.Images.Media.TITLE, filename);
values.put(MediaStore.Images.Media.DISPLAY_NAME, filename);
values.put(MediaStore.Images.Media.MIME_TYPE, "image/jpeg");
values.put(MediaStore.Images.Media.DATE_ADDED, System.currentTimeMillis() / 1000);
values.put(MediaStore.Images.Media.DATE_TAKEN, System.currentTimeMillis());
values.put(MediaStore.Images.Media.RELATIVE_PATH, "PATH/TO/ALBUM");
getContentResolver().insert(MediaStore.Images.Media.EXTERNAL_CONTENT_URI,values);
I then use the URI returned by the insert method to save the bitmap. The problem is that the field RELATIVE_PATH was introduced in API 29, so when I run the code on a lower version, the image is put into the "Pictures" folder and not the "PATH/TO/ALBUM" folder.

is it okay to use deprecated code in this situation or will these methods soon be deleted from the sdk?
The DATA option will not work on Android Q: that column is not included in query() results even if you ask for it, and even if paths do get returned, you cannot use them.
The Environment.getExternalStoragePublicDirectory(...) option will not work by default on Android Q, though you can add a manifest entry to re-enable it. However, that manifest entry may be removed in Android R, so unless you are short on time, I would not go this route.
AFAIK, MediaStore.Images.Media.insertImage(...) still works, even though it is deprecated.
is it possible to implement it in such a way, so it's backwards compatible, but doesn't require deprecated code?
My guess is that you will need to use two different storage strategies, one for API Level 29+ and one for older devices. I took that approach in this sample app, though there I am working with video content, not images, so insertImage() was not an option.

This is the code that works for me. It saves an image to a subdirectory on your phone. It checks the phone's Android version: on Android Q and above it runs the MediaStore code, and on older versions it runs the code in the else branch.
Source: https://androidnoon.com/save-file-in-android-10-and-below-using-scoped-storage-in-android-studio/
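private void saveImageToStorage(Bitmap bitmap) throws IOException {
    final String fileName = "image_screenshot.jpg";
    OutputStream imageOutStream;
    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.Q) {
        // Android 10+: let MediaStore create the file under Pictures/AppName
        ContentValues values = new ContentValues();
        values.put(MediaStore.Images.Media.DISPLAY_NAME, fileName);
        values.put(MediaStore.Images.Media.MIME_TYPE, "image/jpeg");
        // Use File.separator ("/") here; File.pathSeparator is ":" and breaks the path
        values.put(MediaStore.Images.Media.RELATIVE_PATH,
                Environment.DIRECTORY_PICTURES + File.separator + "AppName");
        Uri uri = getContentResolver().insert(MediaStore.Images.Media.EXTERNAL_CONTENT_URI, values);
        if (uri == null) {
            throw new IOException("MediaStore insert failed");
        }
        imageOutStream = getContentResolver().openOutputStream(uri);
    } else {
        // Older versions: write directly into the public Pictures/AppName directory
        File imagesDir = new File(
                Environment.getExternalStoragePublicDirectory(Environment.DIRECTORY_PICTURES), "AppName");
        if (!imagesDir.exists() && !imagesDir.mkdirs()) {
            throw new IOException("Could not create " + imagesDir);
        }
        imageOutStream = new FileOutputStream(new File(imagesDir, fileName));
    }
    if (imageOutStream == null) {
        throw new IOException("Could not open output stream");
    }
    // try-with-resources closes the stream even if compress() throws
    try (OutputStream out = imageOutStream) {
        bitmap.compress(Bitmap.CompressFormat.JPEG, 100, out);
    }
}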

For old API (<29) I place an image into the external media directory and scan it via MediaScannerConnection.
Let's see my code.
This function creates an image file. Pay attention to the appName variable: it is the name of the album in which the image will be displayed.
override fun createImageFile(appName: String): File {
    val dir = File(appContext.externalMediaDirs[0], appName)
    if (!dir.exists()) {
        dir.mkdir()
    }
    return File(dir, createFileName())
}
Then, I place an image into the file, and, at last, I run a media scanner like this:
private suspend fun scanNewFile(shot: File): Uri? {
    return suspendCancellableCoroutine { continuation ->
        MediaScannerConnection.scanFile(
            appContext,
            arrayOf<String>(shot.absolutePath),
            arrayOf(imageMimeType)
        ) { _, uri -> continuation.resume(uri) }
    }
}

After some trial and error, I discovered that it is possible to use MediaStore in a backwards compatible way, such that as much code as possible is shared between the implementations for different versions. The only trick is to remember that if you use MediaColumns.DATA, you need to create the file yourself.
Let's look at the code from my project (Kotlin). This example is for saving audio, not images, but you only need to substitute MIME_TYPE and DIRECTORY_MUSIC for whatever you require.
private fun newFile(): FileDescriptor? {
    // Create a file descriptor for a new recording.
    val date = DateFormat.getDateTimeInstance().format(Calendar.getInstance().time)
    val filename = "$date.mp3"
    val values = ContentValues().apply {
        put(MediaColumns.TITLE, date)
        put(MediaColumns.MIME_TYPE, "audio/mp3")
        // store the file in a subdirectory
        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.Q) {
            put(MediaColumns.DISPLAY_NAME, filename)
            put(MediaColumns.RELATIVE_PATH, saveTo)
        } else {
            // RELATIVE_PATH was added in Q, so work around it by using DATA and creating the file manually
            @Suppress("DEPRECATION")
            val music = Environment.getExternalStoragePublicDirectory(Environment.DIRECTORY_MUSIC).path
            with(File("$music/P2oggle/$filename")) {
                @Suppress("DEPRECATION")
                put(MediaColumns.DATA, path)
                parentFile!!.mkdir()
                createNewFile()
            }
        }
    }
    val uri = contentResolver.insert(MediaStore.Audio.Media.EXTERNAL_CONTENT_URI, values)!!
    return contentResolver.openFileDescriptor(uri, "w")?.fileDescriptor
}
On Android 10 and above, we use DISPLAY_NAME to set the filename and RELATIVE_PATH to set the subdirectory. On older versions, we use DATA and create the file (and its directory) manually. After this, the implementation for both is the same: we simply extract the file descriptor from MediaStore and return it for use.

QnA Bot Framework - How to do accents like "á"

In my qna maker knowledge base I have this:
Question:
Hello
Answer:
Hello maría
But I get this answer from the bot: Hello mar&#237a . I tried many things without any result.
Thanks.

You can take the response you get from QnA and pass it through code like the following:
static void Main()
{
    string unicodeString = "This string contains the unicode character Pi (\u03a0)";
    // Create two different encodings.
    Encoding ascii = Encoding.ASCII;
    Encoding unicode = Encoding.Unicode;
    // Convert the string into a byte array.
    byte[] unicodeBytes = unicode.GetBytes(unicodeString);
    // Perform the conversion from one encoding to the other.
    byte[] asciiBytes = Encoding.Convert(unicode, ascii, unicodeBytes);
    // Convert the new byte[] into a char[] and then into a string.
    char[] asciiChars = new char[ascii.GetCharCount(asciiBytes, 0, asciiBytes.Length)];
    ascii.GetChars(asciiBytes, 0, asciiBytes.Length, asciiChars, 0);
    string asciiString = new string(asciiChars);
    // Display the strings created before and after the conversion.
    Console.WriteLine("Original string: {0}", unicodeString);
    Console.WriteLine("Ascii converted string: {0}", asciiString);
}
Do let me know in case you need more help.

I have created a sample here:
https://github.com/FranciscoPonceGomez/FranciscoQnAccents
It works in all channels for me. You should not have any problems with accents.
You can try it here:
https://franciscoqnaccents.azurewebsites.net/
Let me know if there is anything in the code that doesn't make sense to you.
Regards,
Francisco
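The garbled output in the question looks like an HTML numeric character reference (&#237; is the code point of í), so one possible fix is to decode such references before sending the answer. The sketch below is in Java rather than the bot's own language, and it rests on an assumption the thread does not confirm: that the answer text arrives HTML-encoded. The class NumericEntityDecoder is hypothetical. It accepts references with or without the trailing semicolon, since the output shown in the question has none.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumericEntityDecoder {
    // Decimal references such as "&#237;"; the semicolon is optional here
    private static final Pattern NUMERIC_REF = Pattern.compile("&#(\\d{1,7});?");

    /** Replaces decimal character references with the characters they name. */
    static String decode(String text) {
        return NUMERIC_REF.matcher(text).replaceAll(match -> {
            int codePoint = Integer.parseInt(match.group(1));
            if (!Character.isValidCodePoint(codePoint)) {
                // Leave anything that is not a real code point untouched
                return Matcher.quoteReplacement(match.group());
            }
            return Matcher.quoteReplacement(new String(Character.toChars(codePoint)));
        });
    }

    public static void main(String[] args) {
        System.out.println(decode("Hello mar&#237a"));
    }
}
```

Named references such as &aacute; are not handled by this sketch; a full HTML decoder from the platform or a library would cover those.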

RMagick - convert file to another format without saving to disk

I'm trying to convert a file to a specific format so I can to_blob it. I know the technique of saving to disk and specifying the extension to convert it to another format, like this:
img.write("another_filename.jpg")
I'd like to not have to touch the disk during the conversion.
Is there another way?

You can specify the format when calling to_blob. From the fine manual:
to_blob img.to_blob [ { optional arguments } ]-> string
[...]
No required arguments, however you can specify the image format (such as JPEG, PNG, etc.) and depth by calling the format and depth attributes, as well as other Image::Info attributes as appropriate, in a block associated with the method.
So you can say things like this:
png_bytes = img.to_blob { |attrs| attrs.format = 'PNG' }
Yes, the interface to to_blob is a bit odd but the strange interface is just part of the fun of working with ImageMagick.
You can also use the format= method before calling to_blob:
img.format = 'PNG'
png_bytes = img.to_blob

Read image IPTC data

I'm having some trouble reading the IPTC data of some images. I want to do this because my client already has all the keywords in the IPTC data and doesn't want to re-enter them on the site.
So I created this simple script to read them out:
$size = getimagesize($image, $info);
if (isset($info['APP13'])) {
    $iptc = iptcparse($info['APP13']);
    print '<pre>';
    var_dump($iptc['2#025']);
    print '</pre>';
}
This works perfectly in most cases, but for some images I get:
Notice: Undefined index: 2#025
even though I can clearly see the keywords in Photoshop.
Are there any decent small libraries that could read the keywords in every image? Or am I doing something wrong here?

I've seen a lot of weird IPTC problems. It could be that you have two APP13 segments. I noticed that, for some reason, some JPEGs have multiple IPTC blocks. This is possibly caused by using several photo-editing programs or by some manual file manipulation.
It could be that PHP is trying to read an empty APP13 segment or even embedded "thumbnail metadata".
It could also be a problem with segment length: APP13 and 8BIM have length marker bytes that might hold wrong values.
Try a hex editor and check the file "manually".
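As an alternative to a hex editor, the segment check can be scripted. The sketch below walks the JPEG header segments and reports every APP13 marker (FF ED) with its declared length, which makes duplicate or empty APP13 segments visible. It is a simplified sketch: it ignores fill bytes and standalone markers, and the class name JpegApp13Scanner is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class JpegApp13Scanner {
    /** Returns {offset, declaredLength} for each APP13 segment before the image data. */
    static List<int[]> findApp13Segments(byte[] jpeg) {
        List<int[]> found = new ArrayList<>();
        // A JPEG stream starts with the SOI marker FF D8
        if (jpeg.length < 4 || (jpeg[0] & 0xFF) != 0xFF || (jpeg[1] & 0xFF) != 0xD8) {
            return found;
        }
        int pos = 2;
        while (pos + 4 <= jpeg.length) {
            if ((jpeg[pos] & 0xFF) != 0xFF) {
                break; // not at a marker: stop rather than guess
            }
            int marker = jpeg[pos + 1] & 0xFF;
            if (marker == 0xDA || marker == 0xD9) {
                break; // SOS (start of image data) or EOI
            }
            // Two-byte big-endian length that includes the length bytes themselves
            int length = ((jpeg[pos + 2] & 0xFF) << 8) | (jpeg[pos + 3] & 0xFF);
            if (length < 2) {
                break; // corrupt length field
            }
            if (marker == 0xED) {
                found.add(new int[] {pos, length});
            }
            pos += 2 + length;
        }
        return found;
    }
}
```

A declared length of 2 means the segment has no payload at all, which would match the "empty APP13" case described above.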

I have found that IPTC is almost always embedded as XML using the XMP format, and is often not in the APP13 slot. You can sometimes get the IPTC info by using iptcparse($info['APP1']), but the most reliable way to get it without a third-party library is to simply search through the image file for the relevant XML string (I got this from another answer, but I haven't been able to find it again, otherwise I would link it!):
The XML for the keywords always has the form "<dc:subject>...<rdf:Seq><rdf:li>Keyword 1</rdf:li><rdf:li>Keyword 2</rdf:li>...<rdf:li>Keyword N</rdf:li></rdf:Seq>...</dc:subject>"
So you can just get the file as a string using file_get_contents(get_attached_file($attachment_id)), use strpos() to find each opening (<rdf:li>) and closing (</rdf:li>) XML tag, and grab the keyword between them using substr().
The following snippet works for all JPEGs I have tested it on. It fills the array $keys with IPTC keywords taken from an image in WordPress with ID $attachment_id:
$content = file_get_contents(get_attached_file($attachment_id));
// Collected keywords; stays empty if no XMP keyword block is found
$keys = array();
// Look for xmp data: xml tag "dc:subject" is where keywords are stored
$subject_pos = strpos($content, '<dc:subject>');
$subject_end = ($subject_pos === false) ? false : strpos($content, '</dc:subject>', $subject_pos);
// Only proceed if able to find the dc:subject tags
// (strict comparison: strpos() returns false on failure, and 0 is a valid offset)
if ($subject_pos !== false && $subject_end !== false) {
    $xmp_data_start = $subject_pos + 12;
    $xmp_data = substr($content, $xmp_data_start, $subject_end - $xmp_data_start);
    // Look for tag "rdf:Seq" where individual keywords are listed
    $seq_pos = strpos($xmp_data, '<rdf:Seq>');
    $seq_end = ($seq_pos === false) ? false : strpos($xmp_data, '</rdf:Seq>', $seq_pos);
    // Only proceed if able to find the rdf:Seq tags
    if ($seq_pos !== false && $seq_end !== false) {
        $key_data_start = $seq_pos + 9;
        $key_data = substr($xmp_data, $key_data_start, $seq_end - $key_data_start);
        // $ctr will track position of each <rdf:li> tag, starting with first
        $ctr = strpos($key_data, '<rdf:li>');
        // While loop stores each keyword and searches for next xml keyword tag
        while ($ctr !== false) {
            // Skip past the tag to get the keyword itself
            $key_begin = $ctr + 8;
            // Keyword ends where closing tag begins
            $key_end = strpos($key_data, '</rdf:li>', $key_begin);
            // Make sure keyword has a closing tag
            if ($key_end === false) break;
            // Make sure keyword is not too long (not sure what WP can handle)
            $key_length = min(100, $key_end - $key_begin);
            // Add keyword to keyword array
            $keys[] = substr($key_data, $key_begin, $key_length);
            // Find next keyword open tag
            $ctr = strpos($key_data, '<rdf:li>', $key_end);
        }
    }
}
I have this implemented in a plugin to put IPTC keywords into WP's "Description" field, which you can find here.

ExifTool is very robust if you can shell out to that (from PHP it looks like?)
