SRT to XML converter?

jasonadal
jasonadal Community Member Posts: 441 ♪ Opening Act ♪
Does anyone have a good resource for doing a batch conversion of SRT files to XML?

Because the SRT files created with Camtasia don't work to add captions on our LMS (Pathlore), I need to convert all of the caption files to XML. I found a website (http://tools.rodrigopolo.com/srt2xml/) that will convert them to an XML format that does work, but you have to copy and paste each one individually. I also found a software solution that does the conversion, but when I tested it, the XML output was totally different and did not work.

PS - I'm not entirely convinced that it has anything to do with the Camtasia output - the files don't look different than any other SRT file I've looked at.

Comments

  • smiller7502
    smiller7502 Community Member Posts: 181
    Jason, I don't think your question is specific enough. XML is just a markup language, so not just any XML will work for your LMS. You'll need to know the standard your LMS will accept. (FWIW, we use SRT files in Lectora all the time and they work fine in our LMS [Cornerstone], and it's not immediately clear to me why an LMS would know or care about the formatting of a closed-caption file.)
  • jasonadal
    jasonadal Community Member Posts: 441 ♪ Opening Act ♪
    Here's a sample of what did and did not work with Pathlore. You can tell the difference between the two XML files super easily, which is a comparison I probably should have made before converting them. The site I posted creates the correctly formatted XML version, so worst case scenario, I have a LOT of copy/paste work to do.

    I'm totally not surprised that it works with Cornerstone - My former workplace used it and I loved it - So much easier to upload SCORM than Pathlore. I even emailed a former colleague that works there now saying how much I missed it.
  • smiller7502
    smiller7502 Community Member Posts: 181
    Well, the one that works is certainly the one I would expect to work, so that's good. I've never seen the format for the one that doesn't work before.

    The closest I can get you is https://gotranscript.com/subtitle-converter. It'll write CDATA tags and a little other junk you don't want, but a couple of search-and-replace operations would clean those up easier than copying and pasting one cut at a time.

    It strikes me that any scripting language could convert SRT to the XML format you want without much trouble. If you have a lot of need for this, I could probably whip up a Word macro in a half-hour.
  • jasonadal
    jasonadal Community Member Posts: 441 ♪ Opening Act ♪
    I was hoping to get it without the CDATA - I THINK I tried an XML with the CDATA and that test failed as well - Pathlore is not really my favorite right now.

    I do have quite a few to get converted and on a short turnaround time (around 120 or so). I might jump to the Camtasia forums to see if anyone has anything there that might help, too.

    If you have the time to put something together, I definitely appreciate it, but don't go out of your way - You've done plenty to help already :)
  • smiller7502
    smiller7502 Community Member Posts: 181
    Jason,

    Attached is a Word macro that works very well and very quickly for me, with the caveat that I don't have many samples to test with. If you need instructions on installing or running the macro, please let me know.

    When you run it, you will see a standard Open dialog box. Use it to navigate to and open the SRT file you want to convert. The macro will save an XML file with the same name (except .xml instead of .srt, of course) in the same folder.

    There is one known unknown: The SRT files I have to work with, including yours, show cut timing down to the millisecond, e.g. 00:00:01,666 The macro leaves those timings intact, but I do not know if milliseconds are valid in the XML format you need, or if they need to be separated with a period instead of a comma. If the format doesn't work, that's the most likely reason, and it wouldn't take me 5 minutes to make the macro change the comma to a period or to discard the fractions of seconds altogether if necessary.

    There may be unknown unknowns. ;-)

    Let me know how it goes!
  • jasonadal
    jasonadal Community Member Posts: 441 ♪ Opening Act ♪
    @smiller7502 - The XML files I've converted do use the milliseconds and do swap the comma for a period.

    Did you miss the attachment when posting? I'm NOTORIOUS for doing that at least weekly with email, to the point it gave one of my former supervisors a chuckle when I did it.

    The site I used worked well once I got in a groove - However, I found out this morning that deleting the extra two spaces at the end cut off my last caption when I converted :( I'm debating what the best way to fix them will be - opening the SRT and copying/editing the XML or another method.
  • smiller7502
    smiller7502 Community Member Posts: 181
    @jasonadal, I actually did not miss the attachment -- but I did miss the warning telling me that .bas attachments are not allowed for security reasons. Sorry about that!

    I'm attaching a zipped version here that changes the commas to periods.

     
  • jasonadal
    jasonadal Community Member Posts: 441 ♪ Opening Act ♪
    Thanks again, Stan - I'll look at using this one for the next set of courses, as I managed to get all the SRTs converted using the site above - It's a little tedious, but once you get into a rhythm, it goes pretty quickly :)
  • smiller7502
    smiller7502 Community Member Posts: 181
    I realized this morning that I neglected to package up one of the functions the macro needs (really batting 1.000 on this one!). Here's a new zip so that it will actually work.
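
For anyone landing here without the zipped macro: as noted above, any scripting language can handle this conversion. Below is a minimal Python sketch of a batch converter, not the macro from this thread. It follows what the thread established (milliseconds are kept, the comma becomes a period, no CDATA), but the element and attribute names (`captions`, `caption`, `start`, `end`) are placeholders, because the working Pathlore sample is not reproduced here; change them to match the XML that works for you. It also assumes the SRT files are UTF-8 and that multi-line captions can be joined with a space. The function names are hypothetical.

```python
import re
from pathlib import Path
from xml.sax.saxutils import escape, quoteattr

# Matches an SRT timing line such as "00:00:01,666 --> 00:00:04,200".
TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}),(\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2}),(\d{3})"
)


def parse_srt(text):
    """Return (start, end, caption) tuples with period-separated milliseconds."""
    # Drop a byte-order mark if present and normalize line endings.
    text = text.lstrip("\ufeff").replace("\r\n", "\n").replace("\r", "\n")
    cues = []
    # Cues are separated by blank lines; strip() makes trailing blanks optional.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                start = f"{match.group(1)}.{match.group(2)}"
                end = f"{match.group(3)}.{match.group(4)}"
                # Assumption: join multi-line captions with a single space.
                caption = " ".join(s.strip() for s in lines[i + 1:] if s.strip())
                cues.append((start, end, caption))
                break
    return cues


def cues_to_xml(cues):
    """Build the XML text. Tag and attribute names are placeholders (assumption)."""
    rows = [
        f"  <caption start={quoteattr(start)} end={quoteattr(end)}>"
        f"{escape(caption)}</caption>"
        for start, end, caption in cues
    ]
    header = '<?xml version="1.0" encoding="UTF-8"?>\n<captions>\n'
    return header + "\n".join(rows) + "\n</captions>\n"


def convert_folder(folder):
    """Write an .xml file next to every .srt file in folder; return the new paths."""
    written = []
    for srt_path in sorted(Path(folder).glob("*.srt")):
        # utf-8-sig reads plain UTF-8 and also tolerates a leading BOM.
        cues = parse_srt(srt_path.read_text(encoding="utf-8-sig"))
        xml_path = srt_path.with_suffix(".xml")
        xml_path.write_text(cues_to_xml(cues), encoding="utf-8")
        written.append(xml_path)
    return written
```

Calling `convert_folder("path/to/captions")` would convert every SRT file in that folder in one pass. Because the parser strips surrounding whitespace before splitting, the last caption is kept whether or not the file ends with blank lines.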