Is it possible to import info from the file system and from within txt files?

Tap Forms – Organizer Database App for Mac, iPhone, and iPad Forums Script Talk Is it possible to import info from the file system and from within txt files?

  • April 15, 2021 at 1:41 AM #44167

    MiB
    Participant

    I posted this five minutes ago, but it was marked as spam and disappeared. Sorry if it pops up twice later.

    I am subscribed to Sam’s YouTube channel, where he explains things in Tap Forms that I wouldn’t have thought of before. (If you don’t know the channel, check it out; it is called pasamio.)
    Watching this channel taught me that many more things are possible. So I wonder whether it is possible to automate the things I am doing over and over again.
    If so, I would be happy to pay someone to help me with this project, because I can’t code in JavaScript (or any other language beyond a bit of HTML and CSS).

    The task:
    Filling tapforms with audiobook metadata.

    The status quo:
    A simple database with one picture and some other fields. No links or anything fancy.
    Files on a Mac, in a directory called “x”, all in this format:
    Directory named: “title [author]” or just “title”
    In there:
    one or more (or, rarely, no) “.m4b” files
    one or no “info.txt” file
    one or more “.jpg” files

    What goes where in tapforms:
    The part of the directory name “title” goes to a text field.
    The part of the directory name “author” goes to a text field.
    The length of all combined m4b files goes to a time field with hours / minutes
    The content of “info.txt” goes to a notes field
    The “.jpg” file goes to a photo field

    What is “automated” now:
    I copy and paste the listing of the whole directory “x” into a text file. Then I replace every “ [” with “;” and every “]” with nothing.
    Then I import the text file into Tap Forms, so the “title” and “author” fields are filled.

    What would be the goal:
    Getting all mentioned above automatically into tapforms.
    – title: always there, so it should be inserted into the field in tapforms
    – author: could be empty. So not even the brackets are there. Not “title []”, but “title”. If so, the field in tapforms should be empty too
    – jpg: If one jpg is in the directory, it should go into the field. If more are in the directory, none should go to Tap Forms (I have to choose by hand)
    – length: the length of all m4b files should be added and put into the corresponding field. If no m4b file is in the directory, the field should be empty
    – info.txt: The content of this file should go into the notes field. If there is no info.txt, it should be empty.

    Importing title and author by hand is no problem at all. Getting the time in the field is the most work. Then info.txt and lastly the jpg.
    If these three things could be automated -or all five – it would be fantastic.

    Is this possible without too much effort?

    April 15, 2021 at 11:05 PM #44174

    Sam Moffatt
    Participant

    You can do it with a bit of work. I have a similar sort of use case where I import copies of the YouTube videos to a document that also includes the autogenerated subtitles so that I can do some searches on them.

    It’s a bit of work to set up: basically you need a text file that lists all of your other files in it, because Tap Forms can read files from disk via Script Folder Access. Unfortunately I don’t know of a way to natively traverse directories, but the workaround is to use something like Utils.getTextFromUrl() to read that list and get a candidate set of files.

    Here’s a copy of the script I use to import from disk:

    document.getFormNamed("Script Manager").runScriptNamed("getRecordFromFormWithKey");
    
    var thumbnail_id = 'fld-eb22e30a9e4c4d7f9997b7d3a66ffb3a';
    var title_id = 'fld-4e8e68e2979643be8417c3469015abff';
    var description_id = 'fld-429934cd6ae646b1ac1cf5ad659cb926';
    var url_id = 'fld-b46c858779094c9d906f2ce5e5c4a028';
    var upload_date_id = 'fld-fc1f8537415c4d6cabb8c0784b64f2a6';
    var thumbnail_url_id = 'fld-12b4e040711b4afea624c7c049fdd7ce';
    var subtitles_id = 'fld-05d11d1db96a420fbd66c9dd70bd3038';
    
    function Import_Entries() {
    
    	let prefix = "file:///Users/pasamio/Documents/YT Uploads/";
    	let indexFile = Utils.getTextFromUrl(prefix + "index.txt");
    	
    	if (!indexFile) {
    		console.log("No index file?");
    		return
    	}
    	
    	for (let sourceFile of indexFile.split("\n")) {
    		console.log(sourceFile);
    		let metadata = Utils.getJsonFromUrl(prefix + sourceFile);
    		console.log(JSON.stringify(metadata));
    		
    		if (!metadata) {
    			console.log("Invalid metadata file: " + sourceFile);
    			continue;
    		}
    		
    		let targetRecord = getRecordFromFormWithKey("Video List", url_id, metadata.url);
    		
    		if (!targetRecord) {
    			console.log("Failed to get target record!");
    			return;
    		}
    
    		// If the title is set, skip processing this record.
    		if (targetRecord.getFieldValue(title_id)) {
    			continue;
    		}
    		
    		targetRecord.setFieldValue(title_id, metadata.title);
    		targetRecord.setFieldValue(description_id, metadata.description);
    		targetRecord.setFieldValue(upload_date_id, metadata.uploadDate);
    		targetRecord.setFieldValue(thumbnail_url_id, metadata.previewImageURL);
    
    		let subtitlesFile = sourceFile.replace(/\.json$/, '.srt');
    		targetRecord.setFieldValue(subtitles_id, Utils.getTextFromUrl(prefix + subtitlesFile));
    		targetRecord.addPhotoFromUrlToField(metadata.previewImageURL, thumbnail_id);
    	}
    	document.saveAllChanges();
    }
    
    Import_Entries();

    I have a tool that downloads the data from YouTube creating a JSON file and an SRT file with the subtitles. I then use the Terminal to create the index.txt file with this one liner:

    ls *.json > index.txt
    

    From here I can read all of the JSON files and parse them for data, and then I load the SRT file into a notes field. At some point I plan on turning this into a video as well. Your use case is a little more complicated, but I think it can mostly be done, with the exception of the duration of the audiobooks. That might require some other tooling to do properly, and maybe Tap Forms isn’t the greatest place to build that. I have some tooling that tracks my podcast times, but that uses MP3 and is a little outside of Tap Forms’ native capabilities.

    Thanks for the kind words about the YouTube channel. I wanted to show some of the power of Tap Forms because it’s got a lot in it and hopefully being able to see step by step, with a few mistakes wrapped in, is useful for folks in a way written documentation doesn’t always provide. At some point I have a plan to put something like this up as a video as well.

    April 16, 2021 at 2:50 AM #44177

    MiB
    Participant

    Thank you very much for your detailed answer!
    As previously mentioned, I can’t code, so I can’t take the knowledge that went into this script and use it to build my own.
    But what I have understood is this: I need a text file listing all of the files I want to have in Tap Forms? With

    ls */ > files.txt

    all files are listed. (But of course ALL files, including the ones that should not be in Tap Forms.)

    And it seems the duration is a bigger problem, so I will take that one off the wish list.
    I can easily import the title and author of as many files as I want by hand. So that would be no problem. But I guess it would be much harder to match the content of the info.txt file and the jpgs afterwards, when tapforms is already filled with the titles and the author.
    Therefore all Information should be imported at the same time, directory by directory. Right?

    April 17, 2021 at 12:43 AM #44187

    Sam Moffatt
    Participant

    In your example you have a list of folders and inside a couple of files. If those files are consistent (e.g. folder.jpg and Info.txt) then you can just assume they exist and work from there. So I’d just do ls * > files.txt to get the list of folders you have. It’ll look like your directory listing and then you can compose the paths inside of it.

    I removed the de-duplication logic to simplify so multiple runs will create duplicate entries. If this works, you can grab the Script Manager and use that to add it back (you might want to add a hidden key field as well).

    // init field ID variables
    var thumbnail_id = 'fld-changeme';
    var title_id = 'fld-changeme';
    var author_id = 'fld-changeme';
    var description_id = 'fld-changeme';
    
    function Import_Entries() {
    	// prefix is the path to where the root directory is (also link this via Script Folder Access)
    	let prefix = "file:///Users/youruser/Documents/x/";
    
    	// load up the list of the files 
    	let indexFile = Utils.getTextFromUrl(prefix + "files.txt");
    	
    	// check if we got content or not.
    	if (!indexFile) {
    		console.log("No index file?");
    		return
    	}
    	
    	// split it up line by line and process it
    	for (let sourceFile of indexFile.split("\n")) {
    		// write out the file we're processing in case something goes wrong
    		console.log(sourceFile);
    		// default the title to be the filename and author empty
    		let title = sourceFile;
    		let author = "";
    
    		// use a regexp to look for square brackets
    		let pieces = sourceFile.match(/(.*)(\[([^\]]*)\])*/);
    		if (pieces) {
    			// if we found two parts, set title/author
    			title = pieces[1];
    			author = pieces[2];
    		}
    		
    		// create a new record in this form 
    		let targetRecord = form.addNewRecord();
    
    		// set the title and author fields
    		targetRecord.setFieldValue(title_id, title);
    		targetRecord.setFieldValue(author_id, author);
    
    		// read the Info.txt file into the description note field
    		targetRecord.setFieldValue(description_id, Utils.getTextFromUrl(prefix + sourceFile + "/Info.txt"));
    
    		// add the folder.jpg file to the thumbnail field
    		targetRecord.addPhotoFromUrlToField(prefix + sourceFile + "/folder.jpg", thumbnail_id);
    	}
    	document.saveAllChanges();
    }
    
    Import_Entries();
    

    That should come close to what you need. You’ll need to change the values of the variables at the top to match your field IDs. You can use the ID button in the script editor on the fields and it’ll generate a line with the field ID in it for you. I’m not sure if the image import works from disk, as I’ve not tried it, but I think it should (and if it doesn’t, we ask Brendan nicely to add it). Make sure you use Script Folder Access in the Tap Forms document preferences to link the root directory, otherwise you’ll get permission errors. I haven’t run this, so it’s not guaranteed to work as written, but it should come close.

    I tested the regexp against this data based on the image, it should work but might need tweaks:

    Die Welt nach der Flut [Kassandra Montag]
    Die Welt ohne Strom - 01 - - One...ond After William R. Forstchen]
    Die Welt ohne Strom - 02 - One Year After [William R. Forstchen]
    Die Welt von Arven - Das Schwert der Ahnen [Raphael Sommer]
    Die Wikinger 01
    Wie Rollo, d...er, Herzog der Normandie wurde
    Die Wikinger - 02 Björn Einars..., der Abenteurer [Hans Paulisch]
    Die widen Waldhelden Kaninchen in Not [Andrea Schütze]
    Die Wolf-Gäng - Das Hörspiel zum Kinofilm [Wolfgang Hohlbein]
    Die Wunderfabrik - 01 - Keiner...wissen! [Stefanie Gerstenberger]
    Die Wunderfabrik - 02 - Nehmt... in Acht! [Stefanie Gerstenberger]
    Die Wunderfrauen - 01 - Alles,...erz begehrt [Stephanie Schuster]
    Die zweite Braut [Sihulle Raillon]
    

    Getting the play time is again a little harder but not impossible. I’d probably use ffmpeg to grab it but that would require some stitching.
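
    The stitching could look roughly like this once an external tool has measured the files. This is only a minimal sketch: it assumes a hypothetical durations.txt in each folder with one length in seconds per line (how that file gets written is outside the sketch), and the function name sumDurations is made up for illustration.

```javascript
// Sum per-file lengths (seconds, one per line) and split the total into
// hours and minutes.
// ASSUMPTION: the text comes from a hypothetical durations.txt written by
// an external tool; Tap Forms itself does not measure the audio here.
function sumDurations(text) {
	let totalSeconds = 0;
	let count = 0;
	for (let line of (text || "").split(/\r?\n/)) {
		let seconds = parseFloat(line.trim());
		if (!isNaN(seconds)) {
			totalSeconds += seconds;
			count++;
		}
	}
	// No usable lines (for example no m4b files): return null so the
	// field can stay empty.
	if (count === 0) {
		return null;
	}
	let totalMinutes = Math.round(totalSeconds / 60);
	return { hours: Math.floor(totalMinutes / 60), minutes: totalMinutes % 60 };
}

// Two files of 3600.5 s and 1830.2 s add up to 1 hour 31 minutes (rounded).
console.log(JSON.stringify(sumDurations("3600.5\n1830.2\n")));
```

    The text would be read with Utils.getTextFromUrl like the other files. How the result is written into the time field depends on the field type and is not shown.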

    April 18, 2021 at 4:43 PM #44200

    MiB
    Participant

    Thanks a lot for your effort!
    I fiddled around with this (again, without knowing any JavaScript, so excuse me for asking dumb questions).

    ls * > files.txt results in a listing of all directories, each with a “:” after it and everything in it.
    ls > files.txt results in just the directories, but also files.txt itself, which is created before the listing is written into it.
    ls -d */ > files.txt eliminates that issue but appends a “/” to each directory.
    The solution would be: ls -d */ | sed -e 's-/$--' > files.txt

    – This line crashes tapforms:

    targetRecord.addPhotoFromUrlToField(prefix + sourceFile + "/folder.jpg", foto_id);

    But there are other issues I can’t solve:

    This part seems to work fine:

    function Script() {
    var foto_id = 'fld-38a03d032ec544f5a249080c91527ea1';
    var titel_id = 'fld-d85f5b27ea4845bcaf67296e9601d77f';
    var autor_id = 'fld-d69f7da098ab42339650e1cce7a7f2e6';
    var inhalt_id = 'fld-186f3e4f483c446884d11c771292a78f';
    
    function Import_Entries() {
    	// prefix is the path to where the root directory is (also link this via Script Folder Access)
    	let prefix = "file:///Users/mib/Downloads/x/";
    
    	// load up the list of the files 
    		let indexFile = Utils.getTextFromUrl(prefix + "files.txt");
    	
    	// check if we got content or not.
    	if (!indexFile) {
    		console.log("No index file?");
    		return
    	}

    This next one, seems to work too:

    for (let sourceFile of indexFile.split("\n")) {

    But the regexp, as you call it, doesn’t.

    This:

    let pieces = sourceFile.match(/(.*)(\[([^\]]*)\])*/);
    		if (pieces) {
    			// if we found two parts, set title/author
    			titel = pieces[1];
    			autor = pieces[2];
    			//debug
    			console.log(titel);
    		}
    

    results in 4 lines (I have filled directory “x” with 4 example directories) with just the directory names unchanged.
    A console.log(autor); instead of console.log(titel); will result in 5 (!?) “undefined”.

    After three hours of reading through the basics of regex and Stack Overflow, I have fixed that:
    let pieces = sourceFile.match(/(.*)\s\[([^\]]*)\]/);
    works fine.
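
    For anyone reusing the fixed pattern, here is a minimal sketch of the title/author split as a standalone function. The name parseFolderName is made up, and the ^ and $ anchors plus the trim() calls are additions that are not in the line above. Folders without brackets keep the whole name as the title and get an empty author.

```javascript
// Split a folder name of the form "title [author]" into its two parts.
function parseFolderName(folderName) {
	let title = folderName.trim();
	let author = "";
	// group 1: everything before " [", group 2: the text inside the brackets
	let pieces = title.match(/^(.*)\s\[([^\]]*)\]$/);
	if (pieces) {
		title = pieces[1].trim();
		author = pieces[2].trim();
	}
	return { title: title, author: author };
}

console.log(JSON.stringify(parseFolderName("Die Welt nach der Flut [Kassandra Montag]")));
console.log(JSON.stringify(parseFolderName("Die Wikinger 01")));
```

    In the import loop the two values would then go to setFieldValue for the title and author fields.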

    ________________________________________________

    So for now, creating new records with title and author works.

    What doesn’t work:
    – One additional (empty) record is made. I guess from the “files.txt” in the same directory. (Not really a problem. I just delete it.)

    – info.txt is nowhere to be seen in the records.

    – folder.jpg does not appear anywhere in the records. The line of code crashes tapforms.
    If it is changed to:
    targetRecord.addPhotoFromUrlToField(foto_id, (prefix + sourceFile + "/folder.jpg"));
    (foto_id in front of the path, and extra brackets) it doesn’t crash, but it doesn’t do anything either. And that’s not the order it should be in, according to the manual!

    Not all JPGs are named “folder.jpg”. Would it be possible to just take *.jpg and add it? (Of course I have to make sure there is just one, then.)

    Do you have an idea what’s wrong? (For a quick test, I have attached my files. – The database backup is under X:\)

    April 18, 2021 at 11:13 PM #44207

    Sam Moffatt
    Participant

    > Thanks a lot for your effort!
    > I fiddled around with this (again: without knowing any java script, so excuse me for asking dumb questions).

    We learn by asking questions and making mistakes, if you’ve got the question there is also likely someone else like you that has a similar question. My hope as always is that this helps not only yourself but maybe someone else who finds this with similar questions.

    > ls * > files.txt results in a listing of all directories with a “:” after it and everything in it.
    > ls > files.txt results in just the directories, but also the files.txt which is created before the txt-file is filled.
    > ls -d */ > files.txt eliminates that issue but generates a “/” behind each directory.
    > The solution would be: ls -d */ | sed -e 's-/$--' > files.txt

    Makes sense, I did some playing and this might also work out for you:

    find  * -mindepth 0 -maxdepth 0 -type d
    

    It’s a single command, but I’m not sure we’re gaining much at this point. I was thinking about find earlier; it has its own quirks, but this seems stable from my limited testing. My personal use case involved just JSON files, so it was easy enough to wildcard match on those filenames.

    The colon being appended seems weird though, I wonder if that’s some other sort of flag being introduced. I’ve not seen that before, very interesting. The colon used to be the old MacOS path separator so perhaps it’s that?

    > – This line crashes tapforms:
    >
    >     targetRecord.addPhotoFromUrlToField(prefix + sourceFile + "/folder.jpg", foto_id);

    That might be a bug for @Brendan to look into then, it’s probably not handling the file:// prefix properly.

    > Three hours later reading through the basics of regex and Stackoverflow, I have fixed that:
    >
    >     let pieces = sourceFile.match(/(.*)\s\[([^\]]*)\]/);
    >
    > works fine.

    Glad to hear you figured it out, as I said it’s close but obviously not something I was able to directly test and validate so I expected some minor tweaks.

    > What doesn’t work:
    > – One additional (empty) record is made. I guess from the “files.txt” in the same directory. (Not really a problem. I just delete it.)

    This might be that there is an empty line from the \n split, something simple that might catch that is the following:

    	for (let sourceFile of indexFile.split("\n")) {
    		if (sourceFile.length < 3) {
    			console.log("Empty source file, skipping: " + sourceFile);
    			continue;
    		}
    

    Basically if the length of the string is less than three (empty string, “.” or “..”) then it’ll skip that line. Assumption is that your folders will always be three or more characters long.
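
    If some folder names are shorter than three characters, a slightly different filter might be safer. This is a rough sketch only (the function name folderNamesFromIndex is made up): it trims each line and drops empty lines, “.” and “..” explicitly instead of going by length.

```javascript
// Turn the contents of files.txt into a clean list of folder names.
function folderNamesFromIndex(indexText) {
	return (indexText || "")
		.split(/\r?\n/)                 // accept Unix and Windows line endings
		.map(line => line.trim())       // drop stray spaces around each name
		.filter(line => line.length > 0 && line !== "." && line !== "..");
}

// The trailing newline and the empty line no longer produce empty entries.
console.log(folderNamesFromIndex("Die Wikinger 01\nX\n\n"));
```

    The loop would then become for (let sourceFile of folderNamesFromIndex(indexFile)) with no extra length check inside it.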

    > – info.txt is nowhere to be seen in the records.

    I’d do a console.log on the return value of the Utils.getTextFromUrl(prefix + sourceFile + "/Info.txt") just to see what you’re getting back as a first step for debugging. I did some testing and it feels like maybe the spaces are being messed up by Tap Forms, perhaps another bug for Brendan.

    > – folder.jpg does not appear anywhere in the records. The line of code crashes tapforms.
    > If it is changed to:
    >     targetRecord.addPhotoFromUrlToField(foto_id, (prefix + sourceFile + "/folder.jpg"));
    > (foto_id in front of the path and brackets) it doesn’t crash, but isn’t doing anything either. And that’s not the order in which it should be, according to the manual!

    The swapped order likely doesn’t crash because it can’t find the field and is probably failing silently instead of outright. Crashing Tap Forms probably should have generated an alert for @Brendan. In those cases, if I have a crash dump and can reliably reproduce the problem, I usually also reach out to support@tapforms.com with the crash dump and a note on what I was doing, to get it fixed. As I said, this isn’t something I’ve tested personally, but it is something I want to work towards at some point.

    > Not all jpgs are named “folder.jpg”. Would it be possible to take just *.jpg and add it? (Of course I have to make sure, there is just one then.)

    Once the above bug gets fixed then you can use a similar sort of index file technique to generate a well known file path to import the JPEG files from. This is a little ugly but is something like what I’d do to get it to work:

    (IFS=$'\n'; rm files.txt; for FOLDER in $(find  * -mindepth 0 -maxdepth 0 -type d ); do echo "$FOLDER" >> files.txt; pushd "$FOLDER"; pwd; ls *.jpg > images.txt;  popd; done )
    

    It’s a bit ugly and intended to be used with bash. The IFS portion tells bash to split strings on newlines so that spaces don’t mess things up. I’m using this one-liner to generate files.txt as well, so we delete it before the run starts. I used the find syntax from earlier to find all directories in the current directory and the for FOLDER to loop over them. I dump the folder name into files.txt and change into that directory with pushd. The pwd is there just to print out the full path for debugging, in case you end up somewhere you’re not expecting. The ls grabs the image files, and then popd goes back to the parent directory. It’s wrapped in parentheses to prevent the IFS change from persisting beyond the command execution. I’d probably put it in a simple shell script for safekeeping as well.

    Then in Tap Forms you can use Utils.getTextFromUrl on that path and loop over it to add it to the photo field.
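
    As a rough, untested sketch of that loop: the helper below (the name photoUrlForFolder is made up) takes the text of one folder’s images.txt and returns a full file URL only when exactly one image is listed, so folders with several covers are left for choosing by hand. It assumes images.txt holds bare file names, as the ls *.jpg above would write them.

```javascript
// Decide which image to import for one folder, given the text of its images.txt.
// Returns a full file URL when exactly one image is listed, otherwise null.
function photoUrlForFolder(prefix, folder, imagesText) {
	let images = (imagesText || "")
		.split(/\r?\n/)
		.map(line => line.trim())
		.filter(line => line.length > 0);
	if (images.length !== 1) {
		return null;
	}
	// images.txt only holds the file name, so the folder path goes in front.
	return prefix + folder + "/" + images[0];
}

// Inside the import loop this might be used like so (Tap Forms calls as used
// in the scripts in this thread, not run here):
//   let imagesText = Utils.getTextFromUrl(prefix + sourceFile + "/images.txt");
//   let photoUrl = photoUrlForFolder(prefix, sourceFile, imagesText);
//   if (photoUrl) {
//       targetRecord.addPhotoFromUrlToField(photoUrl, thumbnail_id);
//   }

console.log(photoUrlForFolder("file:///Users/youruser/Documents/x/", "Die Wikinger 01", "cover.jpg\n"));
```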

    Sounds like we’re making progress though! Now it feels like we’re hitting some bona fide TF bugs :D

    April 19, 2021 at 2:03 AM #44211

    MiB
    Participant

    I can’t thank you enough for wasting so much time on this!

    I have tested the crash further. As it seems, the problem is any blank in the pathname.

    targetRecord.addPhotoFromUrlToField(prefix + "xx" + "/folder.jpg", foto_id);

    doesn’t crash. But this:

    targetRecord.addPhotoFromUrlToField(prefix + "x x" + "/folder.jpg", foto_id);

    does crash again. I’ve informed Brendan. (And he already replied! WOW!)

    Your skipping of empty source files works perfectly! ^_^

    April 19, 2021 at 11:40 AM #44212

    Brendan
    Keymaster

    I’ve published a fix for the crashing issue with spaces in the URL to my beta Dropbox folder.

    Let me know if it fixes it for you.

    Thanks,

    Brendan

    April 19, 2021 at 12:03 PM #44213

    MiB
    Participant

    Thanks a lot, Brendan! The issue is solved now. Images are imported just the way they should be. Great!

    April 20, 2021 at 1:06 AM #44217

    MiB
    Participant

    @Sam:

    > Then in Tap Forms you can use Utils.getTextFromUrl on that path and loop over it to add it to the photo field.

    Theoretically this should work. images.txt files are appearing in every directory. Brendan said the next public release will fix the issue with the notes field not getting any input from Utils.getTextFromUrl.

    April 20, 2021 at 5:36 PM #44225

    Sam Moffatt
    Participant

    Sounds good!

    One other thing I noticed: you created the script as a field script. I think you should recreate it as a form script. Form scripts are intended to be run on demand, which fits this case better. Script fields update when a referenced field changes and are more intended to work with the individual records.

    Once Brendan’s done the public release, I’ll do a video with this. Do you mind if I use your use case and sample data as an example?

    April 21, 2021 at 4:42 AM #44229

    MiB
    Participant

    Thank you, I’ve changed that.
    Of course you can use whatever you wish from this conversation. I have learned a lot, and it was fun. I think I should learn some JavaScript, just for fun. :-)
    I will attach the latest files. You can use them in your video if you like.
    Anybody else reading here can use them too. You just have to change the username “mib” to your own in the script and in the command file.

    April 23, 2021 at 1:09 AM #44235

    Sam Moffatt
    Participant

    I think you should learn some JavaScript as well. You’ve already started the journey, and it’d be a shame to end it here! You started off in your OP not confident you could do it, though now you end it with a bona fide Tap Forms bug under your belt and your first successful script!

    Humble Bundle are currently doing a sale on O’Reilly’s Head First series, which includes their Head First JavaScript book alongside a bunch of other books. Head First has a slightly different approach in their books, and the bundle level to get the JavaScript book is $10. That book, like most JavaScript resources, focuses on JavaScript in web development, but a lot of the language features and functionality apply to JavaScript in Tap Forms as well. Also at the $10 tier is their Learn to Code book, which uses Python as an introductory language and might also be helpful in learning how to code. Each programming language has its own quirks, which is the fun part of writing code.

    Thanks for the clearance, I’ve added it to my backlog to add to the channel.

    April 25, 2021 at 1:55 PM #44246

    MiB
    Participant

    As I am from Germany, I will look into some German media, or I will look around on freeCodeCamp. This seems to be a good start.
    But you are right, I should get into this while it is hot (for me). :-P

    July 18, 2021 at 3:13 PM #44827

    MiB
    Participant

    Hi Sam,
    since today, everything is working just fine. – Thanks to the tremendous help from Brendan.

    Here is what he has written:

    One, your info.txt files should have been UTF8 encoding, but they were MacOS Roman. Tap Forms assumes that it’s fetching UTF8 encoded data, so when it encountered MacOS Roman, the conversion from binary to a string returned a null value. When I changed your info.txt files to be all UTF8 encoding, then it worked.

    Two, the URL for the photo was not generated right. You were reading the photo filename from images.txt, but you were using that value as the URL.

    If someone runs into the same problem, here is a solution:
    I have over 800 info.txt files in over 800 subdirectories which are NOT UTF8 encoded.
    To get them converted in one go, use this line:

    find . -type f -iname "info.txt" -exec sh -c 'iconv -f MACROMAN -t utf-8 "$1" > converted && mv converted "$1"' -- "{}" \;

    I’ve used -iname because I have info.txt (lowercase “i”) and Info.txt (uppercase “I”) files.
    “converted” is only a temporary file that is used to overwrite the existing one. (Otherwise iconv would work like this: info.txt > info1.txt.)
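
    To illustrate why a MacOS Roman file trips up a reader that expects UTF8, here is a simplified sketch in plain JavaScript (not a Tap Forms API; the name looksLikeUtf8 is made up). It only checks the lead-byte and continuation-byte structure, not every rule of the UTF-8 standard. In MacOS Roman “ä” is the single byte 0x8A, which can never start a character in UTF-8, where “ä” is the two bytes 0xC3 0xA4.

```javascript
// Simplified structural check: does this byte sequence look like UTF-8?
function looksLikeUtf8(bytes) {
	let i = 0;
	while (i < bytes.length) {
		let b = bytes[i];
		let extra;                                   // continuation bytes expected
		if (b <= 0x7f) extra = 0;                    // plain ASCII
		else if (b >= 0xc2 && b <= 0xdf) extra = 1;  // start of a 2-byte character
		else if (b >= 0xe0 && b <= 0xef) extra = 2;  // start of a 3-byte character
		else if (b >= 0xf0 && b <= 0xf4) extra = 3;  // start of a 4-byte character
		else return false;                           // byte cannot start a character
		for (let k = 1; k <= extra; k++) {
			let c = bytes[i + k];
			if (c === undefined || c < 0x80 || c > 0xbf) return false;
		}
		i += extra + 1;
	}
	return true;
}

// "Bär" encoded as UTF-8, then as MacOS Roman
console.log(looksLikeUtf8([0x42, 0xc3, 0xa4, 0x72])); // true
console.log(looksLikeUtf8([0x42, 0x8a, 0x72]));       // false
```

    Files that contain only plain ASCII pass either way, which would explain why some files can import fine while others with umlauts come back empty.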

    The complete code and samples are in the zip file. (Already in UTF8)
    As mentioned above: You just have to change the username “mib” to your own in the Script and in the command file.

    July 20, 2021 at 3:49 PM #44833

    Sam Moffatt
    Participant

    Great to hear you got it all resolved and that iconv one liner is impressive! I need to get around to doing the video for this as well.

    July 20, 2021 at 8:28 PM #44835

    Brendan
    Keymaster

    It’s interesting because your original files.txt file was already UTF-8 encoded. Just the other ones weren’t.

    July 20, 2021 at 9:02 PM #44836

    Sam Moffatt
    Participant

    The terminal I think created the files.txt and the others were the legacy data structures.

    July 21, 2021 at 12:12 AM #44837

    MiB
    Participant

    > The terminal I think created the files.txt and the others were the legacy data structures.

    That’s correct.
    I have changed the behavior of the text editor now, to write UTF8 files.
