Microsoft Azure Logic Apps: fixed the bug on the content-type of SFTP Connector

Recently, while designing a cloud solution on the Microsoft Azure platform, I ran into a problem.
To retrieve a file from an SFTP server and copy it into an Azure Blob Storage account, I decided to use an Azure Logic App. With the SFTP and Blob Storage connectors that Logic Apps provides, it is easy to build a simple workflow for these “standard” tasks without writing any code or implementing the clients.
It was in this Logic App, and more precisely in the SFTP connector, that I got stuck on an error I could neither understand nor resolve and that, in the end, turned out to be a bug known to Microsoft.

Okay, first things first.
The Logic App I designed was basically supposed to do the following:

  • Be triggered by a message, containing the name of the file, arriving on an Azure Service Bus queue
  • Get the file via SFTP
  • Copy the file obtained from SFTP into an Azure Blob Storage container
  • Send a message to another Service Bus queue to notify that the file has been copied successfully

The transferred files, in this case, were 7zip files, and an Azure WebJob listening on the second queue was in charge of unzipping them and processing their content.

First I tested the webjob that unzips the 7zip file: I manually uploaded a sample file to the Azure Storage Account with the “Azure Storage Explorer” tool. Then, again manually, I used another tool, “Azure Service Bus Explorer”, to send the name of the file to the Service Bus queue the webjob was listening on.
In this way I simulated the work that would later be done by the Azure Logic App, focusing strictly on the webjob's processing logic.
Communicating via the bus decouples the phase of retrieving and copying the file (performed by the Logic App) from the phase of processing it (performed by the webjob).

So, as I said, I uploaded the file to Azure Blob Storage
azure storage explorer manual file upload

and sent the message containing the file name to the queue through “Azure Service Bus Explorer”
azure service bus explorer send a message manually

azure service bus explorer sent message

At this point, the webjob listening on the Service Bus queue was triggered by the message and executed the appropriate function.

using System;
using System.IO;
using Microsoft.Azure.WebJobs;

public class Functions
{
	// This function is triggered when a new message is written to the Azure Service Bus queue "my-test-queue".
	public static void ProcessQueueMessage([ServiceBusTrigger("my-test-queue")] string message, TextWriter log)
	{
		Console.WriteLine("RECEIVED: " + message);
		Console.WriteLine("START: " + System.DateTime.Now);

		// zip extraction logic (elided); it sets "entry" to the name of the extracted archive entry

		Console.WriteLine("Extraction OK: " + entry);

		// zip content processing logic (elided)

		Console.WriteLine("END: " + System.DateTime.Now);
	}
}

The webjob produced the following output while processing the file:

[11/02/2016 15:33:38 > b90717: INFO] RECEIVED: Example_File.7z
[11/02/2016 15:33:38 > b90717: INFO] START: 11/2/2016 3:33:38 PM
[11/02/2016 15:33:40 > b90717: INFO] Extraction OK: Example_File.xml
[11/02/2016 15:33:40 > b90717: INFO] END: 11/2/2016 3:33:40 PM
[11/02/2016 15:33:41 > b90717: INFO] Executed: 'Functions.ProcessQueueMessage' (Succeeded)

Good, the file was processed correctly. The webjob part, which runs once the file has been copied and the notification message sent, was working well.
I then moved on to the other part of the process, the one I had only simulated so far and which, as stated at the beginning of the article, I wanted to implement with an Azure Logic App.

The Logic App, as anticipated, was triggered by the receipt of a message on a queue, through the mechanism provided by the Service Bus connector. Once I had also created the SFTP connector and the one needed to interact with the storage account, I added to the workflow the actions that get the file and copy it into the storage. The last action of the workflow was again a Service Bus notification, sent to the queue the webjob listens on, as seen above.

The Logic App, essentially, was defined as follows:
Azure Logic App from SFTP to Blob storage

To test it, I once again used Azure Service Bus Explorer to send, to the queue the trigger was listening on, a message containing the path of a file I had previously uploaded to a test SFTP server.

The Logic App seemed to have completed its workflow successfully: according to its page on the Azure portal, all the steps had succeeded and the file had been copied into the proper container of the Azure storage account. The step that sends the message to the webjob's queue had succeeded as well, so the webjob should have started processing the file.

Azure Logic App successful run
Having already tested the webjob part, I was quite sure everything had gone well.

Looking at the webjob logs, however, I saw that it had failed and had not processed the 7zip file as expected. In particular, the error returned was the following:

[10/20/2016 13:16:25 > 36205b: INFO] RECEIVED: Example_File.7z
[10/20/2016 13:16:25 > 36205b: INFO] START: 10/20/2016 1:16:24 PM
[10/20/2016 13:16:25 > 36205b: INFO]   Function had errors. See Azure WebJobs SDK dashboard for details. Instance ID is 'e8dda812-dd6c-403f-981e-aa934105ff65'
[10/20/2016 13:16:25 > 36205b: INFO] Microsoft.Azure.WebJobs.Host.FunctionInvocationException: Exception while executing function: Functions.ProcessQueueMessage ---> System.ArgumentException: The stream is invalid or no corresponding signature was found.
[10/20/2016 13:16:25 > 36205b: INFO]    at SevenZip.FileChecker.CheckSignature(Stream stream, Int32& offset, Boolean& isExecutable)

Basically, “The stream is invalid or no corresponding signature was found” clearly meant that the webjob did not recognize the file retrieved from the storage, the one it had been notified about via the bus, as a 7zip file. The file was exactly the same one I had previously uploaded by hand with Azure Storage Explorer and that the webjob had processed properly, so I could not understand why it was being rejected now.

When I did another round, uploading the file manually, everything worked properly again and the webjob processed the 7zip file.
At this point, again via Azure Storage Explorer, I went to look at the container holding the two files, the one uploaded manually and the one copied by the Logic App. I immediately noticed something strange: the two files, which were supposed to be identical since one was just a renamed copy of the other, had different sizes and content types in the Azure storage.
Azure Storage Explorer files with different size
The “Example_File.7z” file uploaded manually with Azure Storage Explorer had content type “application/x-7z-compressed”, while the one deposited by the Logic App connector had the content type of a generic binary file: “application/octet-stream”.
The content type, however, is really just a kind of “label” and, if the file had been intact, being marked as generic binary should not have been a problem.
The real problem was that the manually uploaded file was 5.6 KB in size, while the other was 10 KB.

At this point it was clear that something was going wrong in the file transfer from the SFTP folder to the Azure storage account container through the Azure Logic App.
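A check like the one sketched below can make this kind of corruption visible before the archive is handed to the extraction library. It is not part of the original webjob and it is not the SevenZip library's own check: it only tests whether a stream starts with the six-byte 7z signature (37 7A BC AF 27 1C). The class and method names are hypothetical, chosen for illustration.

```csharp
using System;
using System.IO;

// Hypothetical helper, not taken from the original webjob.
public static class SevenZipSignature
{
    // The six bytes a 7z archive starts with: '7', 'z', 0xBC, 0xAF, 0x27, 0x1C.
    private static readonly byte[] Magic = { 0x37, 0x7A, 0xBC, 0xAF, 0x27, 0x1C };

    // Returns true when the stream begins with the 7z signature.
    // Reads from the stream's current position and does not rewind it.
    public static bool StartsWithSignature(Stream stream)
    {
        var header = new byte[Magic.Length];
        int read = 0;

        // Stream.Read may return fewer bytes than requested, so loop until
        // the header buffer is full or the stream ends.
        while (read < header.Length)
        {
            int n = stream.Read(header, read, header.Length - read);
            if (n == 0)
            {
                return false; // stream is shorter than the signature
            }
            read += n;
        }

        for (int i = 0; i < Magic.Length; i++)
        {
            if (header[i] != Magic[i])
            {
                return false;
            }
        }
        return true;
    }
}
```

Called on the stream of the downloaded blob, together with a log of the blob's length, a helper like this would have flagged the file copied by the Logic App immediately, assuming the corruption also affected the first bytes of the file, which is plausible here because the signature contains non-text bytes.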

Just in case, I also tried to programmatically set the content-type of the Blob in the webjob, before processing the file, with the instruction:

CloudBlockBlob blockBlob = inputContainer.GetBlockBlobReference(msgFilename);
blockBlob.Properties.ContentType = "application/x-7z-compressed";
blockBlob.SetProperties();

but even that did not help because, as mentioned, the problem was not the content-type but the content of the file itself.

Having established that the problem was in the Logic App, I needed to figure out which connector was misbehaving, that is, in which step the file was being manipulated and its encoding wrongly modified. At first it seemed that the problem was in the Storage Account connector, and that it did not preserve the original encoding when copying the content returned by the SFTP get step.
To dispel all doubts I ran an additional test: I added to the Logic App a step that sends an email through the Office 365 connector, immediately after the SFTP get step and just before the copy through the storage account connector. This way, if the file arrived intact as an email attachment but corrupted in the storage, the problem would be confined to the storage connector. If, on the other hand, the attachment sent by the “send email” step was corrupted too, the problem had to be upstream, in the SFTP connector.

The Logic App was modified as follows:
Azure Logic App step send mail Office 365 connector

After I deposited the file in the SFTP folder again, the Logic App was triggered and ran and, as expected, the new step delivered to my mailbox a message with the file attached.
The attachment was again 10 KB, the same size as the file previously deposited in the storage by the Logic App. This was not a good sign and, in fact, when I tried to open it I got an error message indicating that the 7zip archive was corrupted:

Azure Logic App: attachment file format error

At this point, since both the Azure Storage connector and the Office 365 connector had produced the same file, the problem was without a doubt upstream, in the SFTP connector, and the downstream steps were receiving a file that had already been wrongly modified.

But the SFTP connector does not leave much room for maneuver in terms of configuration. The only thing I could do was try to force the content-type of the body of the Logic App step, as described in this article: Logic Apps Content-Type Handling, on the theory that the file was labeled as binary but not actually treated as such.
Switching to “Code View” in the Logic App designer, I added the “@binary()” annotation to the SFTP get step and ran the test again, but nothing changed.

At this point, with no further room for action on the SFTP connector, I tried opening a question on Stack Overflow.

A person from Microsoft replied there, saying that the problem was indeed due to a bug in the Logic App SFTP connector, that they were working on a fix, and that he would notify me once it was released.
Three days later, tagging me in the comments on the question, the same person reported that the bug had been fixed and the fix deployed to all data centers, and asked me to let him know whether the problem was actually solved in my case.

I deposited the file in the SFTP folder again and, this time, the Logic App copied it in the correct format; the webjob, on receiving the message, retrieved the file and, finding it in the expected encoding, processed it successfully. Everything worked perfectly.

The problem was thus solved. I replied to the person from Microsoft, who certainly deserves commendation for his interest and for how he handled the problem, with positive feedback.
