I’m building an in-house email tracking system for our marketing efforts. I’ve figured out how to identify most email clients using the HTTP referrer, but Gmail is giving me trouble. It doesn’t send a referrer at all!
I’m looking for another way to tell when Gmail fetches the tracking pixel from my server. Here’s what I get from print_r($_SERVER):
HTTP_ACCEPT = */*
HTTP_ACCEPT_CHARSET = ISO-8859-1,utf-8;q=0.7,*;q=0.3
HTTP_ACCEPT_ENCODING = gzip,deflate,sdch
HTTP_ACCEPT_LANGUAGE = en-GB,en-US;q=0.8,en;q=0.6
HTTP_CONNECTION = keep-alive
HTTP_HOST = xx.xxx.xx.xxx
HTTP_USER_AGENT = Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.10 (KHTML, like Gecko) Chrome/8.0.552.237 Safari/534.10
REMOTE_ADDR = xx.xxx.xx.xxx
REQUEST_METHOD = GET
Can I use any of this info to spot Gmail? Or is there another trick to get the referrer? How do other email service providers figure out if Gmail was used?
I know there might be ethical concerns, but let’s skip that debate. I just want to replicate what other ESPs do without paying for their services. Any technical advice would be great. Thanks!
Having worked on similar email tracking systems, I found that detecting Gmail takes a combination of signals, since it sends no referrer. In my experience, you can infer Gmail by checking whether the request's IP address belongs to ranges used by Google's fetchers, examining the User-Agent string for hints, and looking at request timing, which often appears clustered when Gmail loads images in batches. One caveat: the pixel request is a plain HTTP request, so it carries none of the email's own headers; things like the DKIM signature are only visible on the mail itself, not on the image fetch. None of these methods is foolproof, and they need regular updates as Gmail evolves.
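The IP check can be sketched roughly like this. It's in Python for brevity, but the same logic ports directly to PHP. The ranges below are placeholders taken from the reserved documentation blocks, not real Google ranges, and the names `ASSUMED_GOOGLE_RANGES` and `is_from_assumed_google_range` are made up for illustration; you'd fill the list from the ranges you actually observe or that Google publishes.

```python
import ipaddress

# PLACEHOLDER ranges (reserved documentation blocks), NOT real Google ranges.
# Replace with the ranges you observe in your logs or that Google publishes.
ASSUMED_GOOGLE_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_from_assumed_google_range(remote_addr: str) -> bool:
    """Return True if remote_addr falls inside any configured range."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Malformed or masked address (e.g. "xx.xxx.xx.xxx"): treat as no match.
        return False
    # An IPv4 address is never "in" an IPv6 network and vice versa,
    # so mixed lists are safe to scan.
    return any(addr in net for net in ASSUMED_GOOGLE_RANGES)


if __name__ == "__main__":
    for ip in ("192.0.2.17", "203.0.113.5", "2001:db8::1"):
        print(ip, is_from_assumed_google_range(ip))
```

You'd call this with the REMOTE_ADDR value of each pixel request and store the result alongside the hit, rather than deciding on this signal alone.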
From my experience developing email analytics tools, detecting Gmail is tricky but not impossible. One approach I’ve found effective is checking the IP addresses of incoming requests against the ranges used by Google’s servers: compile a list of those ranges and test each request’s REMOTE_ADDR for a match. Gmail’s fetches also tend to carry recognizable User-Agent patterns (requests served through its image proxy include a GoogleImageProxy token), which you can fingerprint over time. Another technique is to look at how Gmail loads images - they are often fetched in batches with consistent timing. While not foolproof, combining these signals can give you a reasonably accurate picture of Gmail usage without relying on referrers. Just remember to update your detection logic regularly as Gmail’s behavior evolves.
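Here is a minimal sketch of combining those signals, again in Python, though it maps one-to-one onto PHP's `$_SERVER` array. The assumptions: the `GoogleImageProxy` User-Agent token is what I'd look for on proxied fetches, the IP range is a placeholder from the documentation block rather than a real Google range, and `gmail_signals`, `looks_like_gmail`, and the "two signals out of three" threshold are hypothetical choices you should tune against your own logs.

```python
import ipaddress

# ASSUMPTION: token expected in the User-Agent of proxied Gmail image fetches.
ASSUMED_PROXY_TOKEN = "GoogleImageProxy"
# PLACEHOLDER range (documentation block), NOT a real Google range.
ASSUMED_GOOGLE_RANGES = [ipaddress.ip_network("192.0.2.0/24")]


def gmail_signals(server: dict) -> dict:
    """Evaluate each heuristic separately for a $_SERVER-like dict."""
    user_agent = server.get("HTTP_USER_AGENT", "")
    try:
        addr = ipaddress.ip_address(server.get("REMOTE_ADDR", ""))
        ip_match = any(addr in net for net in ASSUMED_GOOGLE_RANGES)
    except ValueError:
        ip_match = False  # missing or malformed address
    return {
        "ua_token": ASSUMED_PROXY_TOKEN.lower() in user_agent.lower(),
        "ip_match": ip_match,
        "no_referer": not server.get("HTTP_REFERER"),
    }


def looks_like_gmail(server: dict, min_signals: int = 2) -> bool:
    """Flag the request when at least min_signals heuristics agree."""
    return sum(gmail_signals(server).values()) >= min_signals


if __name__ == "__main__":
    request = {
        "HTTP_USER_AGENT": "Mozilla/5.0 (via ggpht.com GoogleImageProxy)",
        "REMOTE_ADDR": "192.0.2.44",
    }
    print(gmail_signals(request), looks_like_gmail(request))
```

Keeping the individual signals in a dict instead of returning only a yes/no makes it easier to log them and see which heuristic stops working when Gmail changes its behavior.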
hey man, i’ve dealt with this before. gmail’s sneaky but u can catch it sometimes. try checkin the ip address against google’s known ranges. also, look at the timing of requests - gmail tends to grab stuff in chunks. user agent strings can give hints too. just keep tweakin ur system cuz gmail changes a lot. good luck bro!
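For the timing part, a rough sketch of spotting "chunks" could look like this (Python, but trivial to port). It assumes you already log a Unix timestamp per pixel hit; `group_into_bursts` and the 2-second `max_gap` are made-up names and values, so treat them as a starting point to tune, not a known Gmail constant.

```python
def group_into_bursts(timestamps, max_gap=2.0):
    """Group hit timestamps (seconds) into bursts.

    A hit joins the current burst when it arrives within max_gap seconds
    of the previous hit; otherwise it starts a new burst.
    """
    bursts = []
    for ts in sorted(timestamps):
        if bursts and ts - bursts[-1][-1] <= max_gap:
            bursts[-1].append(ts)
        else:
            bursts.append([ts])
    return bursts


if __name__ == "__main__":
    hits = [0.0, 0.5, 1.2, 30.0, 30.4, 90.0]
    for burst in group_into_bursts(hits):
        print(len(burst), burst)
```

Bursts with several hits from the same address are the ones worth a closer look; single isolated hits tell you nothing on their own.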