It’s time to take a little break from .NET and enjoy the many different programming languages you can use when integrating with the Salesforce1 Platform.
This week a colleague of mine asked me if I could help him by writing a script to update data he has in Salesforce (okay, he asked me weeks ago, and I only got to it yesterday). In this particular case, he is tracking articles written by different members of our team, with fields such as date published, published URL, and the # of tweets, likes, and LinkedIn shares. Each week he’s able to use Salesforce to produce dashboards and reports that demonstrate how well (or poorly) articles are doing. The complication was that all the social stats (i.e. # of tweets, likes, and shares) were aggregated manually.
A perfect opportunity for scripting, no?
It’s really not that difficult. All three of these particular services provide a simple API that returns a JSON package with the counts we need. For example, let’s say I want to see counts/shares for my post announcing the Salesforce Toolkits for .NET.
For Twitter, create the URL https://cdn.api.twitter.com/1/urls/count.json?url=http://www.wadewegner.com/2014/01/announcing-the-salesforce-toolkits-for-net/ and parse the following JSON:
{
"count" : 144,
"url" : "http://www.wadewegner.com/2014/01/announcing-the-salesforce-toolkits-for-net/"
}
For Facebook, create the URL http://graph.facebook.com/http://www.wadewegner.com/2014/01/announcing-the-salesforce-toolkits-for-net/ and parse the following JSON:
{
"id": "http://www.wadewegner.com/2014/01/announcing-the-salesforce-toolkits-for-net/",
"shares": 28
}
For LinkedIn, create the URL: http://www.linkedin.com/countserv/count/share?url=http://www.wadewegner.com/2014/01/announcing-the-salesforce-toolkits-for-net/ and parse the following JSON:
IN.Tags.Share.handleCount({
"count":18,
"fCnt":"18",
"fCntPlusOne":"19",
"url":"http://www.wadewegner.com/2014/01/announcing-the-salesforce-toolkits-for-net/"
});
Pretty simple, right?
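For illustration, here’s a minimal sketch of parsing those three responses in Python. The helper names are my own; the only wrinkle is that LinkedIn’s response is JSONP, so the callback wrapper has to be stripped before parsing.

```python
import json
import re

# Hypothetical helpers (the names are mine) that parse the three
# payloads shown above.

def parse_twitter_count(body):
    """Twitter returns plain JSON with a 'count' field."""
    return json.loads(body)["count"]

def parse_facebook_shares(body):
    """Facebook's Graph API returns a 'shares' field (absent when a
    URL has never been shared, hence the default of 0)."""
    return json.loads(body).get("shares", 0)

def parse_linkedin_count(body):
    """LinkedIn wraps its JSON in a JSONP callback, so strip the
    IN.Tags.Share.handleCount(...) wrapper before parsing."""
    inner = re.search(r"\((.*)\)", body, re.DOTALL).group(1)
    return json.loads(inner)["count"]
```

With requests, the body for each service is just `requests.get(url).text`.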
So, now the challenge is using this information to update all the records in Salesforce. Let me quickly describe the two approaches I took. The first was fast and simple, but has some long-term complications. The second took longer to create but ultimately provides the soundest approach.
I started with the first approach. Basically, I logged in using the Username-Password authentication flow, grabbed my access token, performed a SOQL query to get all the records with published URLs, looped through them, and then issued an update via the REST API with the proper social counts. Pretty simple.
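A sketch of what that first script might have looked like. This is a hypothetical reconstruction: the OAuth token endpoint and sobject REST path are the standard ones, but the helper names and custom field names are my own.

```python
# Hypothetical sketch of the REST-only approach. record_url builds the
# per-record endpoint; update_record issues one PATCH per record, which
# is exactly what eats into the API limits.

API_VERSION = "v29.0"  # current at the time of writing

def record_url(instance_url, record_id):
    """REST endpoint for a single Byline__c record."""
    return "%s/services/data/%s/sobjects/Byline__c/%s" % (
        instance_url, API_VERSION, record_id)

def update_record(instance_url, access_token, record_id, counts):
    """One REST call per record; counts is a dict of (hypothetical)
    custom fields, e.g. {"Tweets__c": 144, "Facebook_Shares__c": 28}."""
    import requests  # imported lazily so record_url needs no dependency
    requests.patch(record_url(instance_url, record_id),
                   headers={"Authorization": "Bearer " + access_token,
                            "Content-Type": "application/json"},
                   json=counts)
```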
The problem with this approach is that, eventually, I’m going to hit my REST API limits. Not only is my query an API call, but so too is every update. The plan is for this script to run many times throughout the day. It won’t take too long for me to hit the limit.
Although it is a simple and elegant solution, ultimately it won’t work.
As you’ll see, this second approach is a lot more involved. However, it’s also rock solid and designed to go beyond the scenarios supported by the REST API.
The Bulk API is optimized for loading or deleting large sets of data. You can use it to query, insert, update, upsert, or delete a large number of records asynchronously by submitting batches which are processed in the background by Salesforce.
The easiest way to use Bulk API is to enable it for processing records in Data Loader using CSV files. This avoids the need to write your own client application. However, you can also send it XML, which avoids having to construct or generate CSV files; perfect, in my opinion, for our scenario.
Okay, enough talk. Let’s jump into the code.
To go really deep with the Bulk API, I recommend you review the Bulk API Developer Guide and watch the Fast Parallel Data Loading with the Bulk API presentation from Steve Bobrowski & Sean Regan.
There are, roughly, four steps involved in using the Bulk API:
Logging in.
Creating a job.
Adding a batch. (Or multiple batches.)
Closing the job (thereby starting it).
To facilitate these steps, I created a file called async.py that has the following four methods.
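A rough sketch of what such a login method might look like. The envelope follows the standard SOAP login call; treat the specifics (URL version, envelope layout, regex extraction) as assumptions rather than the post's exact code.

```python
import re

# Sketch of a SOAP login against the Partner API.

LOGIN_URL = "https://login.salesforce.com/services/Soap/u/29.0"

def login_envelope(username, password):
    """Build the SOAP body for the login call."""
    return """<?xml version="1.0" encoding="utf-8"?>
<env:Envelope xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
  <env:Body>
    <n1:login xmlns:n1="urn:partner.soap.sforce.com">
      <n1:username>%s</n1:username>
      <n1:password>%s</n1:password>
    </n1:login>
  </env:Body>
</env:Envelope>""" % (username, password)

def login(username, password):
    """POST the envelope; the SOAPAction header must be 'login'."""
    import requests  # lazy import keeps the builder above dependency-free
    headers = {"Content-Type": "text/xml; charset=UTF-8",
               "SOAPAction": "login"}
    response = requests.post(LOGIN_URL,
                             data=login_envelope(username, password),
                             headers=headers)
    # Pull the sessionId out of the XML response with a simple regex.
    session_id = re.search(r"<sessionId>(.+?)</sessionId>",
                           response.text).group(1)
    return session_id, response.text
```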
You can see that this method does the following:
- We pass a header SOAPAction of login.
- A requests post is made to the URL, passing the headers and the encoded data.
Pretty simple and nothing surprising.
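A sketch of a createJob method along these lines. The jobInfo payload and the /services/async/29.0/job endpoint come from the Bulk API documentation; the wrapper code itself is my own.

```python
# Sketch of creating a Bulk API job via a jobInfo XML payload.

def create_job_xml(operation, object_name, content_type="XML"):
    """Build the jobInfo payload for the createJob request."""
    return """<?xml version="1.0" encoding="UTF-8"?>
<jobInfo xmlns="http://www.force.com/2009/06/asyncapi/dataload">
  <operation>%s</operation>
  <object>%s</object>
  <contentType>%s</contentType>
</jobInfo>""" % (operation, object_name, content_type)

def create_job(instance, session_id, operation, object_name):
    """POST to the Bulk API job endpoint; returns the new job's id."""
    import re
    import requests  # lazy import: the builder above is dependency-free
    url = "https://%s.salesforce.com/services/async/29.0/job" % instance
    headers = {"X-SFDC-Session": session_id,
               "Content-Type": "application/xml; charset=UTF-8"}
    response = requests.post(url,
                             data=create_job_xml(operation, object_name),
                             headers=headers)
    return re.search(r"<id>(.+?)</id>", response.text).group(1)
```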
A few things worth noting here:
- The instance, which we derive from the login response. This is used to call against the correct Salesforce datacenter.
- The sessionId, which was returned in the login response. This value is used in the header X-SFDC-Session to validate us against the API.
- The operation, i.e. insert, update, upsert, and so on.
- The object we’re working with. For example, we’re using a custom object called Byline__c, so that’s what we’ll pass in.
- There are two contentType values: CSV and XML. We need to specify which one we’re using.
This creates the job. Next we have to add a batch.
This is likely the most important, and least documented, step, particularly when using XML.
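A sketch of an addBatch method, assuming the XML content type. The sObjects envelope is what the Bulk API expects for XML batches; the helper itself is my own.

```python
# Sketch of adding an XML batch to an existing Bulk API job.

def batch_xml(objects):
    """Wrap pre-built <sObject> fragments in the sObjects envelope the
    Bulk API expects for XML batches."""
    return ('<?xml version="1.0" encoding="UTF-8"?>'
            '<sObjects xmlns="http://www.force.com/2009/06/asyncapi/dataload">'
            '%s</sObjects>' % objects)

def add_batch(instance, session_id, job_id, objects):
    """POST the batch to the job created by createJob."""
    import requests  # lazy import keeps batch_xml dependency-free
    url = ("https://%s.salesforce.com/services/async/29.0/job/%s/batch"
           % (instance, job_id))
    headers = {"X-SFDC-Session": session_id,
               "Content-Type": "application/xml; charset=UTF-8"}
    return requests.post(url, data=batch_xml(objects), headers=headers).text
```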
- We take the objects parameter on addBatch and pass it in. We’ll walk through this in detail shortly.
- We use the jobId that was returned from the createJob response.
Lastly we close the job.
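Closing the job is just a matter of POSTing a jobInfo payload with a state of Closed back to the job URL; a sketch:

```python
# Sketch of closing a Bulk API job.

CLOSE_XML = ('<?xml version="1.0" encoding="UTF-8"?>'
             '<jobInfo xmlns="http://www.force.com/2009/06/asyncapi/dataload">'
             '<state>Closed</state></jobInfo>')

def job_url(instance, job_id):
    """URL of an existing Bulk API job."""
    return ("https://%s.salesforce.com/services/async/29.0/job/%s"
            % (instance, job_id))

def close_job(instance, session_id, job_id):
    """Setting state to Closed queues the job's batches for processing."""
    import requests  # lazy import keeps the helpers above dependency-free
    headers = {"X-SFDC-Session": session_id,
               "Content-Type": "application/xml; charset=UTF-8"}
    return requests.post(job_url(instance, job_id), data=CLOSE_XML,
                         headers=headers).text
```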
As you can see, this is a pretty simple operation.
Before we pull it all together, let’s look at one more method I’ve created to facilitate the Bulk API operation. The reason I didn’t include this method above is that, as you’ll see, it’s very specific to this particular use case; i.e. we’re looping through a lot of data and constructing the object data we’ll eventually pass into the addBatch method.
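A sketch of that method. The custom field names (Published_URL__c, Tweets__c, Likes__c, LinkedIn_Shares__c) are hypothetical stand-ins for the real fields, and the count lookup is injected as a callable so the XML construction is visible on its own.

```python
# Sketch of building the sObject XML for the batch.

def sobject_fragment(record_id, tweets, likes, shares):
    """One <sObject> per record. Id is required; every other field in
    the fragment gets updated in Salesforce."""
    return ("<sObject>"
            "<Id>%s</Id>"
            "<Tweets__c>%d</Tweets__c>"
            "<Likes__c>%d</Likes__c>"
            "<LinkedIn_Shares__c>%d</LinkedIn_Shares__c>"
            "</sObject>" % (record_id, tweets, likes, shares))

def build_objects_xml(records, fetch_counts):
    """Loop the records (as returned by simple_salesforce's query_all)
    and concatenate a fragment per record. fetch_counts maps a
    published URL to a (tweets, likes, shares) tuple."""
    object_xml = ""
    for record in records:
        tweets, likes, shares = fetch_counts(record["Published_URL__c"])
        object_xml += sobject_fragment(record["Id"], tweets, likes, shares)
    return object_xml
```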
Okay, there’s a bit going on here. Once you look, though, you’ll see that it’s not too complex.
- I use simple_salesforce, which provides a set of methods for interacting with the Salesforce REST API. Yes, I’m combining SOAP and REST APIs!
- I run a SOQL query to get the records that have a published link. This is returned as a Python dictionary from the query_all method.
- I use requests to call each of the URLs to get the JSON responses. It’s then simple to pull out the count or shares.
- Everything is appended to the objectXml string that is ultimately returned. It’s worth noting that the Id value is required; everything else added to this object is updated in Salesforce.
Pretty simple!
Okay, let’s pull this all together.
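As a sketch, the driver might wire the pieces together like this. The api object stands in for the methods described in this post (login, query_all, build_objects_xml, create_job, add_batch, close_job); injecting it is my own choice, made so the sequence itself stays easy to follow and test.

```python
# Hypothetical driver for the whole flow:
# login -> query -> build XML -> create job -> add batch -> close job.

SOQL = ("SELECT Id, Published_URL__c FROM Byline__c "
        "WHERE Published_URL__c != null")

def run(api):
    """Run the Bulk API flow end to end; returns the job id."""
    session_id, instance = api.login()
    records = api.query_all(SOQL)
    objects = api.build_objects_xml(records)
    job_id = api.create_job(instance, session_id, "update", "Byline__c")
    api.add_batch(instance, session_id, job_id, objects)
    api.close_job(instance, session_id, job_id)
    return job_id
```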
As you can see, we’re simply calling our methods and evaluating the responses to get data that we pass into other methods.
- We grab the sessionId from the login response. You could also grab the instance here with a simple regex.
- We create the job and grab the jobId.
- We build the objectXml data and pass it into addBatch to create the batch.
Now, let’s look at the job execution. Log in to Salesforce and click Setup -> Jobs -> Bulk Data Load Jobs.
You can click the individual job to see the full details.
And that’s a wrap! I hope this helps explain the process so that, when you attempt to do something like this yourself, it won’t take you as long as it took me.