Shipt’s ‘Effort-Based’ Pay Model Decided a 1,400-Item Order Was Worth About $30

Target-owned personal shopping platform Shipt — which competes with the likes of Instacart — has been transitioning its markets to what it calls an “effort-based” wage system, dubbed V2. We, and the contractors who work for Shipt, have no idea how the new model determines pay. But a very unusual order sent to Gizmodo gives a good indication of what factors V2 isn’t looking at.

Gizmodo was sent screenshots by a current shopper that appeared to show an order of mostly toiletries and cleaning products to be picked up from Meijer (a sort of Midwest Walmart) in Ypsilanti, Michigan, about 16 km outside Ann Arbor. Shipt estimated that whichever shopper claimed the order would earn between $US15 ($21) and $US20 ($29) for their work.

What made this order unlike others, however, was that the customer had selected large quantities, either 100 or 200, of each of the 12 constituent items, for a total of 1,400 items. Only 10 of the 12 items were included in the screenshots, but we ran an estimate of the total cost of the order, substituting Walmart prices where products were no longer listed on the Meijer site and leaving out the markup Shipt includes. Our calculation, which was necessarily 200 items short, came to over $US3,600 ($5,130).

The chasm between the customer cost and estimated payout does a lot to illustrate the lack of fairness in the opaque V2 system, but it likely rankles veteran Shipt shoppers even more: Prior to the “effort-based” system, pay was a commission on the order total. Contractors on the platform have previously voiced concerns that the new model has led to a significant loss of income.

Shipt confirmed over email that the screenshots of both the order and estimated pay were legitimate. A spokesperson wrote that, “simply put, this wasn’t like any other order that came through our system. This order was an outlier. We have brought this to our team’s attention and they will consider these unique factors as we move forward.” It’s unlikely the Meijer in question even had this many of each item on hand, and Shipt claims the order was not shopped and was eventually cancelled.

This being an obvious outlier, and one that seemingly no shopper wasted their time trying to fulfil, what does it matter? Well, as a case study it helps us to reverse-engineer V2 a bit. We can of course rule out that total order cost has any bearing on the algorithm, but we can also likely rule out that the number of items does either. The screenshots sent to Gizmodo include a second order at the same Meijer, for only 24 items with a payout of $US11 ($16)-$US15 ($21). The reduction in pay seems to correlate more closely to the time Shipt estimates the shop will take: 52 minutes vs. 70 minutes for the outlier order.
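That comparison can be made concrete with a little arithmetic. The following is a minimal sketch, not a reconstruction of V2: it uses only the figures quoted above (in US dollars) and assumes, as our own simplification, the midpoint of each estimated pay range. The labels `outlier` and `typical` are ours.

```python
# Figures quoted in the article, in US dollars.
# Using the midpoint of each pay range is our own simplifying assumption.
orders = {
    "outlier": {"items": 1400, "pay_low": 15, "pay_high": 20, "minutes": 70},
    "typical": {"items": 24, "pay_low": 11, "pay_high": 15, "minutes": 52},
}


def pay_rates(order):
    """Return implied pay per item and per estimated minute at the range midpoint."""
    midpoint = (order["pay_low"] + order["pay_high"]) / 2
    return {
        "per_item": midpoint / order["items"],
        "per_minute": midpoint / order["minutes"],
    }


rates = {name: pay_rates(order) for name, order in orders.items()}

for name, r in rates.items():
    print(f"{name}: ${r['per_item']:.4f} per item, ${r['per_minute']:.2f} per minute")
```

On those midpoints, both orders work out to $0.25 per estimated minute, while pay per item differs by a factor of more than 40 ($0.0125 versus about $0.54). Two data points prove nothing about a proprietary model, but they fit a time-based reading far better than an item-based one.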

“While our pay model is proprietary, I can tell you that we do have triggers that are built in to add additional compensation for large or particularly complex orders. What was rare about this order was how few SKUs (or individual product lines) were included combined with the quantities of them,” the spokesperson added. If I had to guess, then, Shipt uses the number of SKUs — and potentially their placement within the store layout — to build a time estimate, and considers that as a primary indicator for estimated pay. “Our evolved pay model does factor in item quantity and complexity, including multiples, as a part of compensation,” Shipt further clarified. Seemingly quantity has very little impact on earnings, however.

Strangely, in support of shifting the model from commission to “effort” the same spokesperson noted, “We recognise that shopping for one order that has three high-ticket items is not going to take as much effort as shopping for another order of the same value that has 100 items.”

The majority of customers are unlikely to dickishly request that 1,400 items be delivered to their home by an app-based contractor, and Shipt claims that since V2 rolled out, “This was the only order that had more than 1,000 items.” The spokesperson also added that most orders fall under 200 items. (Perhaps I’m alone in this, but I do not find 200 to be an encouragingly low number.) Like other meal- and grocery-delivery platforms, Shipt has experienced an extraordinary surge in demand during the pandemic.

The other factor likely not accounted for in V2 is weight. Each of the 700 detergent bottles in the outlier order is about five pounds. Together that comes to roughly 3,500 pounds (1,588 kg), which no one in the world can lift. A shopping cart experiences catastrophic failure at close to 1,400 pounds. While, again, this order is very much an outlier, and no shopper blew out their back trying to haul several lifetimes’ worth of detergent around Michigan, on a smaller scale this almost certainly happens with other heavy items, such as cases of water. Any online forum where app workers gather is rife with shoppers complaining about the lack of accommodation for heavy orders (cases or gallon jugs of water being the most common culprit).
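The weight figures are easy to check. This is a minimal sketch that takes the rounded numbers above at face value (about five pounds per bottle, cart failure near 1,400 pounds); the variable names are ours.

```python
# International pound, defined as exactly 0.45359237 kg.
LB_TO_KG = 0.45359237

bottles = 700            # detergent bottles in the outlier order
lb_per_bottle = 5        # "about five pounds" each, so an approximation
cart_limit_lb = 1400     # approximate cart failure load quoted above

total_lb = bottles * lb_per_bottle
total_kg = total_lb * LB_TO_KG
cart_loads = total_lb / cart_limit_lb

print(f"Detergent alone: {total_lb} lb, or about {total_kg:.0f} kg")
print(f"That is {cart_loads:.1f} times the load at which a cart fails")
```

The detergent alone comes to 3,500 pounds, or about 1,588 kg, which is 2.5 times the load at which a cart is said to fail.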

Asked about the inhuman feat of strength that shopping this order would have required, Shipt relented, admitting that “we understand that the effort to shop and deliver has several factors, and are also exploring how we can consider weight going forward.” Even Instacart adds a surcharge to orders weighing more than 23 kg, though its website does not say how much the charge is or if its contractors receive that surcharge in full.

The algorithm controlling the pay of thousands upon thousands of shoppers remains a black box, and unfortunately this sort of information asymmetry is the norm rather than the exception among gig economy platforms. But this example has shone a little light on things, in Shipt’s case anyway. What the company has confirmed, beyond that the number of SKUs is a factor and seemingly item quantity too (though it does not seem to be reflected strongly in earnings), is that the V2 algorithm examines “drive times” and “the day and time of the week.” How those factors are weighted, and what other inputs V2 is checking, remain opaque only because Shipt refuses to share them.