Universal File Connector
What is the Universal File Connector?
In certain instances, a Connector is not appropriate, for example, if the word-count throughput is small, or there is a requirement to roll out a solution relatively quickly, or you intend to build a connector but need something in the interim to help define the process and requirements for the full connector.
There are two main differences between a connector and the Universal File Connector:
1. The Universal File Connector is not a plug-in. It does not need to be installed on the client side.
2. The Universal File Connector monitors an endpoint that is file based, while a connector may create files from a database.
The Universal File Connector moves files from A to B, and it can determine from filenames, folder structures, and metadata what needs to be done with the files. Some of this can be preconfigured, for example so that every retrieval is translated into the same set of languages each time.
Which Endpoints Does the Universal File Connector Support?
- Amazon (AWS) S3 Bucket
- GIT command
- Network File-share
- SharePoint 2010
- SharePoint 2013
- SharePoint 365 (online)
- Translation Workspace (TW)
- Download/upload files from/to a variety of data sources (full list above), for example:
- File-share (Network)
Note: The Universal File Connector does not support federated authentication. This can be an issue when connecting to SharePoint (or any other) systems controlled by Microsoft for Microsoft work, because these systems use this form of authentication.
SharePoint 2010 (not fully tested)
Office/SharePoint 365 (SharePoint Online)
GIT (for example GitHub and Bitbucket, which are integrated and tested; other GIT-based systems such as TFS may also work)
Azure and Azure Tables
Note: The Connector uses Azure Tables in conjunction with Azure. Azure stores the files (for retrieval and delivery), and the Tables hold the job item information (a manifest, which helps the automation pick out the appropriate files for translation and the languages they are needed in). With this method, the Azure Table can also contain status information that the Universal File Connector updates, for example In Translation, Delivered, etc.
Note: Jira works similarly to an Azure Table. The items in Jira act as work items, which the Universal File Connector reads. It then processes the associated files, which are either attached to the work item or identified by a link in it, and creates translation jobs in the languages specified in the work item. Like an Azure Table, the Jira work item is updated by the Universal File Connector to show In Translation, Delivered, etc.
- Generic design
The Universal File Connector is designed to be generic, so that it can be used/configured/deployed with as many different clients as possible with minimal modification.
- Freeway/TMS
Can upload files/content directly to the Lionbridge technology stack.
- Translation Workspace
TMs can be uploaded/downloaded – allowing Lionbridge to automate the sharing of Translation Memories with partner vendors.
- Supports all language pairs supported by Freeway/TMS.
- Target languages can be pre-configured or can be embedded in the filename or folder name to be determined at run time.
- Source language can be set to a default, or it can be determined from the folder the content is found in, for example \project\de-de\. This allows the Universal File Connector to support multi-source language projects.
- Universal File Connector builds
The Universal File Connector application does not follow the traditional version release pattern, in that there is no v1.0 or v2.0 etc. – it is in constant development and there is generally only one current version available. However, older versions may still be deployed.
Note: After the Connector has been deployed and confirmed to be working properly, the Lionbridge Connector Team generally does not upgrade a customer to a newer version unless substantial new functionality has been added.
- Log files
All actions taken by the Universal File Connector application are recorded in log files – including details of what was done and when (date/time).
- Reference files
Supports the retrieval of reference files to Freeway/TMS, but the file types expected must be configured.
- Support files
- Support files can be submitted along with files to be translated and reference files. These support files may be used to render the files in the Online Review Tool, or they can simply provide feedback to be implemented by the translators, similar to reference files.
- These support files are not generally delivered to the client.
- Tracking progress in a SharePoint tracker, if available.
- You can track the current status of a job in a SharePoint Tracker.
- For Azure/Jira, the Connector can update items in an Azure Table or a Jira item.
- The Universal File Connector updates the tracker entry with any updates that it collects.
- Asynchronous delivery, either:
- Deliver languages to customer as available (recommendation).
- Deliver all files simultaneously. This is required for multi-lingual files.
- Configuration/deployment time depends on the complexity of the process to be mapped. However, the simplest process (download files from FTP, create/submit job to Freeway, upload translated files to FTP) takes several hours to set up and test, while a more complex process may take a few days or longer to achieve a similar result, for example, if complex TMS workflows must be set up for multi-lingual files.
- If the standard configuration does not fully map the required process, the Connector Team may develop a solution that addresses the requirement. However, there may be a cost, depending on the type of development required. This can be discussed/agreed before starting development.
- Several standard templates can fast track the configuration/ deployment process. Future versions may include a configuration wizard.
- Can create jobs from downloaded content and submit jobs to:
- Translation Workspace
- a location in a file-share
- The Universal File Connector usually runs in silent/automatic mode:
- where no intervention is required after initial setup and configuration, unless an exception or error occurs
- Silent mode is usually used for predictable processes, such as:
- Similar files/file types
- Stable language set
- Process/workflow does not change depending on file types
- A tool like Windows Task Scheduler runs the Universal File Connector in silent mode.
- Best practice: Monitor any automation for issues/exceptions, at least once daily.
- Universal File Connector can run in manual mode.
- Provides more flexibility to the technical team relative to silent mode.
- Enables a technical team to function seamlessly, in that one person can take over another’s job during absences, without requiring the replacement person to fully understand the process.
- Enables a team to manually download/deliver content, if required.
- Users see the status of the current jobs and react, if required.
- Users run the Universal File Connector when necessary rather than based on a schedule.
- When the Universal File Connector is run in manual mode, you can enable MultiUser, so that several people or a team can use the same Universal File Connector instance, for example on the same project.
- Email Notifications
- Info notifications: file receipt and delivery notifications.
- Error notifications, if the Universal File Connector experiences a problem, such as:
- Cannot connect to FTP, SFTP etc.
- Cannot connect to internal systems/file shares i.e. Freeway/TMS etc.
- Files are too large, if trying to submit large files to Freeway/TMS, which have file-size limits.
- Non-receipt-of-files notifications, which are useful if you expect content to arrive regularly within a specified time window, for example before 2 PM every day.
- This has been modified so that if you expect two (or more) languages and only one arrives, then a notification is sent for the missing language only.
- Universal File Connector Active Notification, which is usually sent daily. This indicates that the Universal File Connector is running.
Note: If the Universal File Connector is not working, then no notification is sent.
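The non-receipt check described above (notify only for the languages that are still missing once the time window closes) can be sketched as follows. This is a minimal illustration, not the Connector's actual code: the function names, the message text, and the idea of passing in the list of received language codes are all assumptions made for the example.

```python
from datetime import time


def missing_languages(expected, received):
    """Return the expected language codes that have not arrived yet.

    Comparison is case-insensitive; the order of `expected` is kept.
    """
    received_set = {code.lower() for code in received}
    return [code for code in expected if code.lower() not in received_set]


def non_receipt_message(expected, received, now, deadline):
    """Return a notification text once the deadline has passed, else None.

    Hypothetical helper: the real notification wording is configurable.
    """
    if now < deadline:
        return None  # still inside the agreed delivery window
    missing = missing_languages(expected, received)
    if not missing:
        return None  # everything arrived, nothing to report
    return "No files received by {} for: {}".format(
        deadline.strftime("%H:%M"), ", ".join(missing)
    )
```

With two expected languages and only French received, a check run after a 14:00 deadline would report German alone, which matches the "missing language only" behavior described above.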
- File agnostic (usually)
- By default, the Universal File Connector does not care what the content of the file is. It does not check the file content unless it is configured to split and merge text files, as described below.
- The Universal File Connector assumes all files it downloads are to be sent for translation unless instructed otherwise, for example when certain file types are configured as reference files or support files. The Connector can determine from the filename how to route the files. A plug-in/.exe written specifically for this requirement may handle this.
- You can configure the Universal File Connector to look into the content of the file, for example, to split the file into chunks with a specific number of words per file, or into multiple files of a certain size.
- Target languages
- Languages are set up in Language Mapping.
- They can be in any format in the client system, for example French, fr-fr, etc. However, for any files/content that will be submitted to Freeway, all languages must be mapped to the ISO language/country codes, for example ja-jp. This is easily done within the Universal File Connector.
- The language codes can also be mapped back to the preferred naming convention of the customer. e.g. fr-fr -> FR (provided they arrived with this format)
- There are three ways to determine the required target languages:
- If the languages will always be the same for each job, then Lionbridge can configure this in a list and only update when changes are necessary. This is configured in fromlist in Packet Language of the configuration.
- Either the filename or the path contains the target language. The Connector uses a regular expression to extract the required information.
- Language information can be contained in the filename. For example, if you have a home_fr-fr.html file, the Connector can be configured to translate this file into French.
- Similarly, the client can set up a tree structure. The Universal File Connector can be configured so that any files/content in a sub-folder are sent for translation into the target language of the folder name. For example, files/content in the ko-kr folder are translated into Korean.
- The target languages are specified in a manifest file. There are three ways to use a manifest file:
- The file contains only a list of languages.
- The file contains a list of the files for translation plus a matrix of languages, which are either blank or contain an X if translation is required. The files are grouped by language. The Connector submits a job for each target language.
- The file contains a list of the files for translation, the configuration or Freeway Order Type to use, and the languages, as described above.
The client creates the manifest file and submits it with the files to be translated, as a zipped package. The Universal File Connector opens the zipped file and extracts the manifest. It uses this file to assign the files to jobs and target languages.
Note about Freeway Bundles: The Universal File Connector has limited support for Freeway Bundles, so that all the files listed in a manifest file are divided into several jobs by target language, and then grouped together as one project financially, i.e. one Gemini Code per project. The support is limited because the Connector supports only one target language per Freeway Order.
Note about custom manifest files: The Universal File Connector can also handle custom manifest files. However, if a custom manifest does not align with the built-in manifest formats, the Lionbridge Connector Team may need to build a parser to interpret it.
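To illustrate the second manifest style above (a list of files plus a matrix of languages marked with an X), the sketch below groups files into one job per target language. The actual manifest format is not specified in this document, so the CSV layout, the `file` header column, and the function name are assumptions chosen only for the example.

```python
import csv
import io
from collections import defaultdict


def jobs_from_manifest(manifest_text):
    """Group files by target language from a file-by-language matrix.

    Assumed layout: the first column holds the filename, every other
    column is a language code, and an "X" marks "translate this file".
    """
    reader = csv.reader(io.StringIO(manifest_text))
    header = next(reader)
    languages = [cell.strip() for cell in header[1:]]
    jobs = defaultdict(list)
    for row in reader:
        if not row or not row[0].strip():
            continue  # skip blank lines
        filename = row[0].strip()
        for language, cell in zip(languages, row[1:]):
            if cell.strip().upper() == "X":
                jobs[language].append(filename)
    return dict(jobs)


# Hypothetical manifest content, for illustration only.
manifest = """file,fr-fr,de-de,ja-jp
home.html,X,X,
about.html,X,,X
"""

for language, files in jobs_from_manifest(manifest).items():
    print(language, files)
```

Each key of the returned dictionary corresponds to one job, in line with the rule that the Connector submits a job for each target language.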
- Source languages
- Most projects have only one source language. In the Universal File Connector, the regular configuration is to set up a default source language, for example en-us.
- It is straightforward to have one source and either:
- a fixed set of target languages detailed in a list
- to determine the target language dynamically from the filename
Note: This requires a source file for every target language, with a complex payload. There will be a separate Freeway Order for each source file.
- However for projects with multiple source languages, the Connector works as follows:
- It extracts the source language from the folder path, an internal tag in the file, or the filename.
Note: Whether the source language can be extracted from the filename depends on how the Connector identifies the target languages. Extracting both the source and target languages from the filename may cause confusion. Therefore, a format must be agreed in advance so that the Connector can differentiate between source and target languages.
- After identifying the source language, then the Connector determines the target languages, based on a path/filename or configuration in a list, manifest file, etc.
Recommendation: In a multi-source project, the path should specify the source language, and if the target languages are also dynamic, then they should be specified in the filenames.
If the target languages never change, then the source of extraction of the source language is more flexible.
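The recommendation above (source language in the path, target language in the filename) can be sketched with two regular expressions. The patterns, the `xx-yy` locale shape, the `_xx-yy.ext` filename convention, and the function name are assumptions for illustration; in a real deployment the expressions are part of the agreed configuration.

```python
import re
from pathlib import PurePosixPath

# Assumed locale shape: two letters, a hyphen, two letters (e.g. de-de).
LOCALE = r"[a-z]{2}-[a-z]{2}"
# A folder whose whole name is a locale code marks the source language.
SOURCE_IN_PATH = re.compile(r"(?:^|/)(" + LOCALE + r")(?:/|$)", re.IGNORECASE)
# A trailing "_xx-yy" before the extension marks the target language.
TARGET_IN_NAME = re.compile(r"_(" + LOCALE + r")\.[^.]+$", re.IGNORECASE)


def detect_languages(path, default_source="en-us"):
    """Return (source, target) for a file path.

    The source falls back to `default_source`; the target is None when
    the filename carries no language code.
    """
    normalized = PurePosixPath(path.replace("\\", "/"))
    source_match = SOURCE_IN_PATH.search(str(normalized.parent))
    target_match = TARGET_IN_NAME.search(normalized.name)
    source = source_match.group(1).lower() if source_match else default_source
    target = target_match.group(1).lower() if target_match else None
    return source, target


print(detect_languages(r"\project\de-de\home_fr-fr.html"))
```

Because the folder and the filename are searched separately, the two codes cannot be confused with each other, which is the reason for keeping them in different places.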
- Universal File Connector designed to be generic
The Universal File Connector is designed to be as generic as possible, so that it can be configured and deployed for as many accounts as possible. However, it will not work in every situation. If you encounter an issue that is not handled and that you think this Connector should manage, please contact the Lionbridge Connector Team.
Additional functionality to the Universal File Connector can be added via a self-developed plug-in, such as:
- Split and merge text files, which works best with certain types of XML.
- Can split a file into smaller manageable chunks so that a job can be easily shared between several translators if there is a tight turnaround time, such as a 24-hour turnaround. You can:
- split by word count, e.g. 1000-word files, so one 14000-word text file can be split into 14 different 1000-word files
- split files by file size
- The Connector automatically merges the split files when they return from translation, and they are delivered to the client as a single file.
- Global file rename.
- Run a search-and-replace on files before or after translation.
- Decryption/Encryption – normally this is handled by TMS, however files/content can be decrypted/encrypted if required. This is not usually necessary, as critical content is usually accessible only via a client SFTP, and any downloads/uploads go directly to/from the Lionbridge systems behind the Lionbridge firewall.
Note: Plug-ins usually reduce the amount of configuration that the Universal File Connector requires, as much of the functionality is built into the code. However, this occurs at the expense of developing the plug-in. If there is a very specific process not required for any other project, a plug-in is generally the best approach.
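The split-by-word-count and merge behavior described above can be sketched for plain text as follows. This is a simplified illustration only: it treats whitespace-separated tokens as words and does not attempt the XML-aware splitting that the document says works best. The function names are hypothetical.

```python
import re


def split_by_word_count(text, words_per_chunk=1000):
    """Split text into chunks of at most `words_per_chunk` words.

    Each token keeps its surrounding whitespace, so concatenating the
    chunks in order reproduces the original text.
    """
    if words_per_chunk < 1:
        raise ValueError("words_per_chunk must be at least 1")
    tokens = re.findall(r"\s*\S+\s*", text)
    return [
        "".join(tokens[start:start + words_per_chunk])
        for start in range(0, len(tokens), words_per_chunk)
    ]


def merge_chunks(chunks):
    """Merge chunks (for example, after translation) back into one text."""
    return "".join(chunks)


# A 14000-word text yields 14 chunks of 1000 words each.
sample = " ".join("word{}".format(i) for i in range(14000))
chunks = split_by_word_count(sample, 1000)
print(len(chunks))
```

Keeping the chunks in a fixed order is what lets the merge step deliver a single file to the client once every chunk has come back from translation.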
- Reference files
- Reference files can be included with content files for retrieval. They will be parsed out of the file-set to be translated. This must be configured in advance.
- Support files will be treated as reference files, and they will appear in Freeway and TMS as such.
- Support files
- Support files can be sent along with the translatable material and reference files (if any).
- These files can be used, for example, to help render the translatable files in an Online Review Tool.
- These support files are usually not delivered to the client.
- Can run on a Server (recommended) or a Desktop: computer must be running all the time.
- Admin/Configuration Mode is password protected.
- Configuration file is CRC checked to ensure:
- Only authorized changes are accepted (password controlled).
- If the application is copied to another computer, settings must be updated before it will run.
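This document does not describe how the configuration CRC check is implemented. The sketch below only illustrates the general idea of detecting an edited or copied configuration with a CRC-32 checksum. Binding the checksum to a machine name, the function names, and the sample values are all assumptions. Note also that CRC-32 detects accidental or casual changes but is not a cryptographic protection.

```python
import zlib


def config_checksum(config_bytes, machine_name):
    """CRC-32 over the machine name and the configuration content.

    Including the machine name (an assumption of this sketch) makes the
    stored checksum invalid when the file is copied to another computer.
    """
    payload = machine_name.encode("utf-8") + b"\0" + config_bytes
    return zlib.crc32(payload) & 0xFFFFFFFF


def config_is_valid(config_bytes, machine_name, stored_checksum):
    """Return True if the configuration matches the stored checksum."""
    return config_checksum(config_bytes, machine_name) == stored_checksum


# Hypothetical usage: the checksum is stored when an authorized
# (password-protected) change is saved, and verified at start-up.
stored = config_checksum(b"source=en-us", "HOST-A")
print(config_is_valid(b"source=en-us", "HOST-A", stored))
```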
- SharePoint Tracker
- Project status information can be stored in a SharePoint tracker.
- The information stored in the tracker will be the status known to the Universal File Connector.
- The Universal File Connector will update the tracker each time a new event occurs, i.e. job submitted, in translation, cancelled, completed, etc.
- How it works
- The Deployment Integrator creates the process steps (or storage areas) called Universal File Connector Dropbox – i.e. Download from Client SFTP
- An action is associated with each Universal File Connector Dropbox – there are currently four types of actions:
- FTP/SFTP: Specify the details of the client SFTP, the login credentials, host, port/domain, SFTP path where the files are, whether you should delete the files/content that has just been downloaded.
- Freeway: Specify the Freeway details, i.e. whether Freeway demo or production, the web-service credentials to direct the files/content to the correct Order Type, and by extension on to the correct TMS workflow if available.
- TMS: Specify the TMS details, project ID, config ID, credentials etc. required to submit the job to TMS.
- Plugin/Exe: Allow a plugin or executable file to add additional functionality to the Universal File Connector application, which can then be applied to the retrieval and delivery.
- Additional required configuration
- Which languages should be set up? Language selection can be dynamic, i.e. language detail can be embedded in the filename or URL. However, the Connector must know the expected languages, so that it can verify that the content in those languages was retrieved and delivered. If content is missing, the Connector writes an error to the log file.
- Language Mapping: Set this up to ensure the client languages map correctly to the languages used by the translation provider, such as Lionbridge Freeway.
- Setting up email notifications and tailoring email messages to suit project/client.
- Packet/Job information – jobs entering the Universal File Connector.
- Extract from internal tag in file
- Extract from file name or path using Regex
- Name of job: you can add additional identifier information
- Project Name
- Name from folder
- Details on how a packet should be formatted
- Run Exe i.e. to split the files in manageable chunks
- Deliveries – jobs returning to the client
- Delivery types
- Run Exe, i.e. to merge the files
- File-naming convention
The file name remains unchanged. However, you can add various identifiers to it, i.e. client code, language code, Universal File Connector ID, Date/Time etc. to facilitate differentiating downloads from each other and to facilitate troubleshooting if required.
- File-name/Path too long
Occasionally a file name or path is too long and exceeds the 250+ character path-length limitation. The Universal File Connector can handle this by shortening the name for the translation process (this requires selecting an option) and then automatically restoring the original file name before delivery.
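One way to picture the shorten-then-restore behavior is a small registry that remembers the original name for each shortened one. The Connector's real naming scheme is not documented here, so the `~` separator, the hash suffix, the 50-character limit, and the class name are assumptions made for this sketch.

```python
import hashlib
from pathlib import PurePosixPath


def shorten_name(filename, max_length=50):
    """Return a name of at most `max_length` characters.

    Short names are returned unchanged. Long names keep their extension
    and gain a short hash so that different files stay distinguishable.
    """
    if len(filename) <= max_length:
        return filename
    path = PurePosixPath(filename)
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()[:8]
    keep = max(max_length - len(path.suffix) - len(digest) - 1, 0)
    return path.stem[:keep] + "~" + digest + path.suffix


class NameRegistry:
    """Remembers original names so they can be restored before delivery."""

    def __init__(self):
        self._original = {}

    def shorten(self, filename, max_length=50):
        short = shorten_name(filename, max_length)
        self._original[short] = filename
        return short

    def restore(self, short_name):
        return self._original[short_name]


registry = NameRegistry()
long_name = "a" * 300 + ".xml"
short_name = registry.shorten(long_name)
print(len(short_name), registry.restore(short_name) == long_name)
```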
- Analysis Details:
The Universal File Connector application can retrieve the analysis details from TMS if the file/content has been submitted directly to TMS. The Universal File Connector can then format the analysis details so that they can be used as a report for the project manager or client. This may be helpful for quoting, although the Universal File Connector is not envisaged as a medium to review/approve quotes, only as a notification service.
Plug-ins can be created to add specific functionality for the Connector that is likely unique to a client.
- Future Developments
- Creation of templates/Wizard – work in progress
- UI Review (ensure consistency - ongoing)
- Simplify language in UI/tool tips - ongoing
- Online Help
- Supervision: this will take all status information from the log files and present it via a web interface. This will allow the PM, for example, to see the status and potentially interact with the job, and may even allow for re-deliveries etc.
For Support requests please use the following email - ServiceDesk@lionbridge.com.
Universal File Connector – Case Studies
The client has its own FTP/SFTP site, and wants Lionbridge to monitor a specific location on this site several times daily for new content. Any new content is to be downloaded and submitted to the Lionbridge technology stack, i.e. Freeway and TMS for a fixed set of languages. This content comes in two different formats and will need to run against two different TMs, so it will be sent to two different workflows in TMS via Freeway.
As the turnaround times are short and as many of the steps must be automated to avoid delays, the content files in XML must be split up into 1000-word chunks (can also be split by file size if required) and then submitted to Freeway, so that multiple translators can work on the content simultaneously to ensure they are returned within 22 hours.
Once the files are translated, the automation will merge the files and upload them to the client FTP (specified location) and notify the team that the content has been delivered.
Note: As the turnaround time is short, the automation will also send out a notification at a certain specified time, say 2pm to the client to remind them that the retrieval for translation has not yet been made.
The client will use the Lionbridge FTP site for the retrieval of content/files, and it wants any content it uploads to be automatically downloaded and submitted to Freeway and/or TMS.
The client has multiple different content types, i.e. documentation and software, which they require to be treated differently, i.e. go to different workflows/translation memories. To complicate matters more, this client has multiple different source language files that must be handled. The Connector’s Multi-Source functionality accommodates this.
The client must also send reference files that can be submitted into Freeway/TMS and be associated with each language or each project.
After translation, the files will be delivered back to the Lionbridge FTP to a specified folder.
The client has agreed to deliver and retrieve content to/from the ftp.lionbridge.com ftp site. The client wants Lionbridge to monitor a specified folder there and download the files/content according to specified rules.
- Files containing the FW keyword will be sent directly to Freeway, with a Freeway Order created and submitted for each language.
- Files containing TMS in the filename will be sent directly to TMS, bypassing Freeway altogether.
- Note: This type of scenario requires developing a plug-in for the automation, because this is not out-of-the-box functionality.
The automation monitors both Freeway and TMS. When files/content are translated, the Connector downloads them and delivers them to the client.
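The keyword-based routing in this last case study is the kind of rule a plug-in would implement. A minimal sketch is shown below, assuming that keywords appear as separate tokens in the filename (delimited by characters such as `_`, `-`, or `.`) and that a file carrying both keywords is flagged rather than guessed at. The function name and the return values are hypothetical.

```python
import re


def route_file(filename):
    """Decide where a file goes based on keywords in its name.

    Returns "freeway", "tms", "ambiguous" (both keywords present), or
    "unrouted" (no keyword present).
    """
    # Split on anything that is not a letter or digit, compare in upper case.
    tokens = {part.upper() for part in re.split(r"[^A-Za-z0-9]+", filename) if part}
    has_fw = "FW" in tokens
    has_tms = "TMS" in tokens
    if has_fw and has_tms:
        return "ambiguous"
    if has_fw:
        return "freeway"
    if has_tms:
        return "tms"
    return "unrouted"


for name in ["guide_FW_fr-fr.docx", "strings-TMS.xml", "readme.txt"]:
    print(name, route_file(name))
```

Matching whole tokens rather than substrings avoids routing a file by accident when the letters FW or TMS happen to occur inside a longer word.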