Digital Decluttering: Photo Collection Successes and Defeats (Premium)

Image: mess of photos (credit: Joanna Kosinska on Unsplash)

After I finally ran into a wall—really, multiple walls—with this year’s digital decluttering efforts, I scaled back my short-term goals and recalibrated. And I created a set of to-do lists so that I could keep track of what I needed to complete before I could move on to further decluttering projects. There were four to-do lists, each with specific goals:

  • Photo collection. Consolidate my OneDrive- and Google Photos-based photo collections.
  • Photo and document scans. Organize and archive the remaining 2023 photo and document scans.
  • YouTube. Clean up and organize the Thurrott.com YouTube channel and upload and publish all (300+) of my archived videos there.
  • Move my documents and pictures to new locations so Folder Backup doesn’t clutter up my documents archive and photo collection.

I made good progress on some of these to-do items.

As I documented in Following My Own OneDrive Advice (Premium), I ended up moving all of my OneDrive-based content into a new “Paul” folder in the root of OneDrive, with Apps, Documents, Music, Photos, and Videos sub-folders that are beyond the reach of Folder Backup. I was surprised by how well this worked and how easy it was.

I also completed my update of the Thurrott.com YouTube channel, which included fixing and updating the branding; creating numerous playlists, including one for First Ring Daily; uploading, organizing, creating titles and descriptions for, and then publishing the over 300 videos in my personal archive; and many other basic cleanup and maintenance tasks.

I made some progress on the remaining 2023 photo and document scans, with over 300 photos (JPEGs) and documents (PDFs) correctly tagged with date-based meta-data, organized, and archived in multiple locations. But there is still a daunting amount of work to do here, with over 1500 files left to sort through and fix. I had hoped to finish this one before we went to Mexico, but I didn’t and then barely looked at it while we were there because I was focused on updating the Windows 11 Field Guide and my normal day-to-day work. (Those efforts were both successful, at least.)

And I worked a lot on the photo collection consolidation—before, during, and since the Mexico trip—and even wrote a lot about what I had done, though I never published anything about this because I ultimately didn’t solve the problem. But it’s possible that some of you will find this information interesting, if only as a peek at the way my brain works, both good and bad. More to the point, all of this work and some recent successes suggest that I may, in fact, solve this problem. So you can find that below.

Before getting to that, there is a related drama hanging over my head: As I wrote Friday in Digital Decluttering: Online Accounts, Again (Premium), Google threw a wrench into my already-scrambled plans by discontinuing the ability to buy “individual storage subscriptions” for Google Workspace accounts, which I had been using and relying on with my primary online identity, my Workspace account. Long story short, I spent most of Friday uploading what I hope is a fairly complete copy of my Workspace-based Google Photos collection to my Gmail-based Google Photos just in case. And I engaged in a live chat with Google Workspace customer service, which escalated my issue, and I’m still waiting to hear back.

As you may recall, I had originally planned to move all my personal stuff off of my Workspace account and to my personal Gmail account to keep the work and personal data separate and associated with the right accounts, respectively. But this effort was one of the multiple walls I mentioned up top: I ran into serious issues just moving over YouTube Music-based playlists, a feature that the service should explicitly support but does not. (And third-party services are all terrible and don’t do that accurately at all.) For that and so many other reasons, I gave up on this account work to focus on those to-do items first. And I had a longer-term goal of looking at this again once I’d made some progress or completed other decluttering projects.

Regardless of what I hear from Google Workspace support, I can thank this storage crisis for introducing a bit of clarity. It occurred to me that I don’t have to move all of my personal stuff from my Workspace account at once as originally planned. Instead, I can just keep using YouTube Music and YouTube with that account, and can keep my work-related documents there. But I can move my Google Photos photo collection to my Gmail account for now and just keep using that going forward. (Deleting my photos from the Workspace account should solve the storage problem there. I’ll only do that when and if I literally have to.)

And so that’s what I’m doing. I switched over my phones so that they back up to the Google Photos associated with my personal Gmail account, not my Workspace account. As noted, I uploaded my entire Takeout-based Google Photos/Workspace collection to the Gmail account on Friday. I changed the Google Photos partnership I have with my wife from Workspace to Gmail, so we can access each other’s photos. And I factory reset the two Lenovo smart displays we have and connected them to the Gmail account so they can find our photos in their new location.

That latter bit, about the smart displays, speaks to one of many reasons why it’s preferable as an individual to use a Gmail account instead of a Workspace account with Google services: Workspace accounts are limited in ways that Gmail accounts are not, and they get access to new Google services later or not at all. This specific case is a good example: Because I had connected the displays to my Workspace account, my wife (who has a normal Gmail account) could not speak to the displays; they would not allow anyone but me to perform certain tasks, like get reminders, find out what was next on the schedule, and so on. But now that we’re both using Gmail accounts, everything works: When she talks to the displays, they know it’s her and they provide her with her information. When I talk, they know it’s me.

Put simply, it’s “better” to use a personal Gmail account with personal services. And now I’m stepping in that direction, finally. So thanks for screwing me over, Google. It’s going to work out for the best.

OK. On to the mess.

Below, you will find a lengthy, detailed, and probably useless description of the different directions I went in trying to consolidate my downloaded Google Photos (Workspace) and synced OneDrive photo collections. Some of this work occurred before we went to Mexico City in mid-October, and some happened during the trip. I have also worked on this since we got home this past week, but because I think I’m finally heading in the right direction after spending most of Saturday on this, I’ve pulled that bit out and will (probably) use it in a future post in which I (hopefully) explain how I maybe got in front of this project.

The content below is mostly about me going down the wrong paths, trying things and failing, learning a few things, finding some useful new tools, and ultimately getting no closer to solving the problem—the problem being, again, consolidating multiple photo collections. Because one thing I learned as I did this work, sadly very late in the process, is that I had three collections to consolidate, not two. And that changes things.

Yep. It’s in there if you dare. I recommend that most of you skip right over it.

 

 

Photo collection consolidation, again

I am coming to grips with the fact that I may never fully rectify my Google Photos and OneDrive photo collections. Though this data is among my most important, it may be too time-consuming and difficult to bother.

A quick recap

As part of this year’s digital decluttering work, I intended to cleanly reorganize my online accounts between work and personal, and as part of that, I downloaded my entire 570 GB Google Photos-based collection from my Google Workspace account so that I could upload it to my personal Gmail account. In doing so, I discovered something unexpected and unwelcome: This collection was not as complete as I thought it was and did not include all my photos. This was just one of many, many issues I had in this online account process, but long story short I put a halt to that and scaled back my ambitions in this area so that I could focus on finishing up a few key decluttering tasks first.

In addition to Google Photos, I have a separate photo collection in OneDrive that consists of two parts: A 253 GB Camera roll folder that contains photos backed up from smartphones and a 175 GB Photo collection folder that contains date-based subfolders spanning my entire life, with a mix of scanned photos (an archive, essentially), photos from digital cameras, and some weird subset of newer smartphone-based and other photos.

At a high level, these two collections mirror the ages in which they were created. The OneDrive collection is like my documents archive in that it is decades old and uses a folder-based organizational structure. But the Google Photos archive was created in the smartphone era and isn’t at all organized, so it relies on search to find things.

All my photos are out there, but they’re spread between these two collections with significant overlap, and neither is a superset of the other. Some photos are only in Google Photos, and some are only in OneDrive. I would like to fix this. But it is an enormous and difficult task due to the sheer number of files, the storage it all takes up, and the logistics of figuring out how to reconcile it all. It may be impossible or at least not worth the time and effort this work will require.

Where to start?

Just figuring out where to start is difficult.

As I learned during my documents archive work, it’s next to impossible to organize a lot of files when they’re stored remotely in cloud storage or on a NAS, and my solution was to do as much as possible locally on a PC by syncing the files (from cloud storage) or copying them there (from the NAS) first. And so I decided to do the same with my photos: I bought a 1 TB Samsung T7 external SSD (USB 3.2 Gen 2-based) from Amazon and copied my downloaded Google Photos collection to it. Then, I synced my OneDrive Photo collection folder to a laptop’s internal drive. This would let me compare the two locally on that laptop.

I wrote about this plan in Digital Decluttering: Taking It On the Road (Premium) and have since spent a lot of time thinking about the best way to move forward. Perhaps predictably I went down some rabbit holes in which I did a lot of work but have little real-world success to show for it.

An experiment in organizing my Google Photos collection

One idea I had was to better organize the Google Photos collection so that it matched the folder structure in the OneDrive Photo collection folder as closely as possible. As noted, this collection is huge, at 570 GB, and the download is what I’d call “lightly organized,” with top-level folders like “Photos from 1981” and the like in its root.

But each of those folders contains only raw photo files with no organization. Working with that many files is untenable and, besides, it’s the “master” offline copy of that collection and I can’t afford to screw it up. So I tested some organizational strategies on a subset of the collection by copying one of its folders—Photos from 2016—to the Desktop on one of my PCs. And then it was time to experiment.

I spent a lot of time on this and, to be fair, I did learn a lot in doing this. But the most important takeaway may have been the futility noted at the top of this article: After a lot of work, I had organized that one year’s worth (2016) of photos from Google Photos into two folders, which I called Sorted and Unsorted.

The Sorted folder contains 15.2 GB of photos with good meta-data that are organized into 296 date-based folders (2016-01-01, etc.) with 4,912 files, somewhat mirroring my OneDrive folder structures (but without the event-based naming, like 2016-01-01 – New Years).

And the Unsorted folder is, well, unsorted. It contains an unorganized 5.98 GB dump of unsorted photos, 1,440 of them, most of which are paper photo scans without the proper meta-data. And that’s just for one year: my full Google Photos collection contains about 60 year-based folders, many of which are bigger than 2016.

How I organized that 2016 folder is perhaps a tale worth telling, though automating this task is not much closer to being resolved than it was when I started. Overall, I’d say the biggest success here—or perhaps the biggest takeaway—is that I have identified a lot of photo scans that need to be correctly tagged and then re-uploaded to Google Photos. Doing that will be on the easy-ish side, since they are organized in OneDrive by date, but will nonetheless be tedious and time-consuming because that’s what tagging photos is. And again, that’s just one year’s worth.

Because I cannot describe this organizational experiment as a success, I will be as brief as possible.

First, I copied the “Photos from 2016” folder out of the Google Photos collection that I had downloaded with Takeout to the desktop. Then, I used an application called Bulk Rename Utility (BRU, it’s free for personal use) to scan the 2016 folder and copy its contents into a separate test folder that contained date-based sub-folders of photos matching my date-based organizational scheme.

I had to create a custom renaming template for that, which required lots of testing, but in the end, I got it to work, with 6352 photos copied and organized into date-based folders.
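For anyone who would rather script this step, the date-folder idea can be sketched in a few lines of Python. This is not what BRU does internally, and it is not the renaming template I built. It is a minimal illustration that assumes each file’s last-modified time is a good enough stand-in for “Date taken,” which will not be true for many files (scans in particular). The function name copy_into_date_folders is made up for this example, and the destination folder should sit outside the source folder.

```python
import shutil
from datetime import datetime
from pathlib import Path


def copy_into_date_folders(source: Path, dest: Path) -> int:
    """Copy every file under source into dest/YYYY-MM-DD/ and return the count."""
    copied = 0
    for item in sorted(source.rglob("*")):
        if not item.is_file():
            continue
        # Assumption: the modification time approximates "Date taken".
        # Real photo metadata (EXIF) is not read here.
        taken = datetime.fromtimestamp(item.stat().st_mtime)
        folder = dest / taken.strftime("%Y-%m-%d")
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / item.name
        if target.exists():
            # Never overwrite: leave name clashes for manual review.
            continue
        shutil.copy2(item, target)  # copy2 preserves the timestamps
        copied += 1
    return copied
```

Because it copies rather than moves and skips anything that already exists, it can be run against a test folder repeatedly without touching the originals.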

Unfortunately, that didn’t work for the videos in there, as those files do not have the same “Date taken” meta-data, or any other meta-data that BRU would recognize. (I tried each option.) So I used PhotoMove Pro, which I had purchased previously as part of this year’s work, to perform the same magic on the 152 videos from 2016. Unfortunately, this app doesn’t support custom folder structure naming, and so I had to pick the preset that was closest to my system, which was to create folder names with underscores, like 2016_01_01.

This worked, but I still needed to merge the videos with the photos. And that meant renaming all those folders to use dashes (“-”) instead of underscores (“_”).

Automating that required a lot of research, as I certainly didn’t want to do it manually. In the end, I figured it out using Windows PowerShell. Here’s the command, which I have to say I’m pretty proud of:

Get-ChildItem -Path "C:\Users\paul\Desktop\Test\2016\" -Directory | ForEach-Object { Rename-Item $_.FullName -NewName ($_.Name -replace '_', '-') }

With that done, I simply moved the newly organized and correctly named video folders into the same folder as the photos. Voila: The semi-organized Sorted and Unsorted folders I described above were now complete.
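The rename and the merge could also be folded into one scripted step. The Python sketch below is only an illustration of that idea, not what I actually ran: it assumes a root folder of underscore-named video folders (2016_01_01) and a root folder of dash-named photo folders (2016-01-01), and the function name merge_date_folders is hypothetical. It refuses to overwrite anything and reports the files it skipped instead.

```python
import shutil
from pathlib import Path


def merge_date_folders(video_root: Path, photo_root: Path) -> list[Path]:
    """Move files from folders like 2016_01_01 into photo_root/2016-01-01.

    Returns the files that were left in place because a file with the
    same name already existed in the target folder.
    """
    skipped = []
    for folder in sorted(p for p in video_root.iterdir() if p.is_dir()):
        # Same substitution as the PowerShell one-liner: underscores to dashes.
        target_dir = photo_root / folder.name.replace("_", "-")
        target_dir.mkdir(parents=True, exist_ok=True)
        for item in sorted(p for p in folder.iterdir() if p.is_file()):
            target = target_dir / item.name
            if target.exists():
                skipped.append(item)  # keep both copies for manual review
                continue
            shutil.move(str(item), str(target))
    return skipped
```

Anything in the returned list is a name clash worth looking at by hand before deleting the leftover video folders.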

Looking through Unsorted, I also discovered that many of the scanned photos were oriented incorrectly and would need to be rotated. It’s easy to rotate individual photo files, or even groups of them, in File Explorer, but doing so across several hundred folders would be tedious and, again, this is just one year’s worth of photos. So again I spent a lot of time researching a way to automate this, but this time I came up empty. I’m sure there is a way, perhaps by moving photos with a certain height or width to a temporary folder where I could select them all, bulk rotate them in one action, and then move them back. But I had spent so much time on this whole thing and it was becoming clear this was a dead end. And so I gave up on automation and just did it manually, one screen’s worth of files at a time, in File Explorer. Like a jerk. And yes, it was as tedious as it sounds.
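The width-and-height idea could start with something like the following Python sketch, which only finds candidates and does not rotate or move anything. It assumes the scans are JPEGs and that “wider than tall” is a useful signal for a wrongly oriented scan, which is a guess on my part. It also ignores any EXIF orientation tag. The names jpeg_size and landscape_jpegs are hypothetical.

```python
import struct
from pathlib import Path

# JPEG start-of-frame markers, which carry the image height and width.
SOF_MARKERS = {0xC0, 0xC1, 0xC2, 0xC3, 0xC5, 0xC6, 0xC7,
               0xC9, 0xCA, 0xCB, 0xCD, 0xCE, 0xCF}


def jpeg_size(path: Path):
    """Return (width, height) from a JPEG header, or None if not found."""
    with path.open("rb") as f:
        if f.read(2) != b"\xff\xd8":  # every JPEG starts with SOI
            return None
        while True:
            byte = f.read(1)
            if not byte:
                return None
            if byte != b"\xff":
                continue
            marker = f.read(1)
            while marker == b"\xff":  # skip fill bytes
                marker = f.read(1)
            if not marker:
                return None
            code = marker[0]
            if code in (0x00, 0x01) or 0xD0 <= code <= 0xD9:
                continue  # markers without a length field
            if code == 0xDA:
                return None  # image data started without a frame header
            header = f.read(2)
            if len(header) < 2:
                return None
            (length,) = struct.unpack(">H", header)
            if code in SOF_MARKERS:
                data = f.read(5)
                if len(data) < 5:
                    return None
                _precision, height, width = struct.unpack(">BHH", data)
                return width, height
            f.seek(length - 2, 1)  # skip the rest of this segment


def landscape_jpegs(root: Path) -> list[Path]:
    """List JPEG files under root that are wider than they are tall."""
    found = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in {".jpg", ".jpeg"}:
            size = jpeg_size(path)
            if size and size[0] > size[1]:
                found.append(path)
    return found
```

The resulting list would still need a human eye, since plenty of photos are legitimately wider than they are tall.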

And there’s the problem.

After all that work, all I had were sorted and unsorted folders of photos (and videos) for just a single year’s worth of photos from Google. I had done nothing to compare them to and hopefully consolidate them with what’s in OneDrive. And so the next step was to compare the two, since the folder structures in each would be easy to examine side by side. And they are, if you overlook how many folders I’d need to compare over time. But doing so yielded no obvious way forward.

All that work was (semi) pointless.

The problem

Here’s why: I switched from digital cameras to smartphones for photos in 2013. And depending on the year, those photos were all backed up to the OneDrive Camera roll folder and/or Google Photos. What I really needed to do, at least for 2016, was compare Google Photos to the OneDrive Camera roll, not to the 2016 folder in my OneDrive photo collection. That latter folder has almost nothing in it.

Realizing this mistake, I gave up temporarily. It’s hard doing that much work without any payoff, like writing all day but not publishing any of it. And so I simply copied that 2016 folder (with its Sorted and Unsorted sub-folders) to the portable SSD so I could bring it to Mexico City in October and perhaps revisit this process.

Which I did. But I’m still no closer to figuring this out.

As noted, my Camera roll folder takes up 253 GB of disk space, but in looking at it, I was reminded that this folder has evolved over the years. In the early days, Microsoft apparently didn’t foresee that its users would eventually have tens of thousands of photos, and then eventually hundreds of thousands or even millions of photos. And so it configured OneDrive on mobile to back up all the photos from a phone to the root of Camera roll. But in time—2014 or so, it looks like—it apparently started creating year-based subfolders (and inside those, month-based subfolders), introducing a bit of organization to this folder.

But it’s still a mess: My Camera roll folder has over 63,000 files in its root and then a small selection of more organized date-based folders.

2016 is one of those date-based folders, so I could experiment with that and compare its contents to my newly organized 2016 folder from Google Photos. This would be reasonably easy, I guess. But that would still account for just one year. One God-damned year. There are roughly 60 more year-based folders in both OneDrive and the Google Photos download that I would then need to compare. And apply meta-data to some subset of photos. And commingle. And re-upload to both services.

Yikes. I would really like to organize these photos. But I would also like this not to become my entire life. There must be a better way.

Another experiment

Looking at this in Mexico City, it occurred to me to fall back once again on what I learned during my recent documents archive work. Among the many things I did there was identify and eliminate duplicate files, a process that saved many 10s if not 100s of GBs of storage space. And, importantly, made it much easier to work with the underlying folders of data.

But I would have to do this a bit differently for the photos: While I had local copies of the photos in the Google Photos collection, I would need to compare those files to the photos in my OneDrive Camera roll folder (and not its Photo collection folder). And the contents of that folder are not available locally as I hadn’t synced it to a PC.

I could do that, though its size (again, 253 GB) is problematic given that I was working off laptops while in Mexico. But maybe that wasn’t necessary as the Camera roll folder is sort of available locally thanks to OneDrive Files on Demand, and so all I needed to do was compare file names and look for duplicates. This seemed like a good place to start.

So, I researched duplicate file finders and came up with AllDup, which is free. After installing it, I pointed it at both folders (my Google Photos takeout on the SSD and the unsynced Camera roll folder on a laptop), configured the app to look only for identical file names, and let it do its thing. It finished pretty quickly given the size of the collections, in about six and a half minutes, and I was told that it had scanned over 248,000 files and had found 167,000 duplicates. 58 percent of the files thus far were duplicates.
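A name-only comparison like this is simple enough to sketch in Python. To be clear, this is not how AllDup works internally and its counts will not necessarily match AllDup’s, which may tally duplicates differently. It is just an illustration of matching on file names alone, ignoring case. The function names are made up, and I am assuming that listing names (without opening the files) does not force Files on Demand placeholders to download.

```python
from pathlib import Path


def file_names(root: Path) -> set[str]:
    """Collect lower-cased file names under root without opening any file."""
    return {p.name.lower() for p in root.rglob("*") if p.is_file()}


def shared_names(root_a: Path, root_b: Path) -> set[str]:
    """File names that appear somewhere under both folder trees."""
    return file_names(root_a) & file_names(root_b)
```

The size of the set returned by shared_names gives a rough sense of the overlap, with the same caveat as any name-only scan: matching names are not proof of matching photos.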

This seemed promising.

But there is so much I need to get right here. First, when you think about it, I’m not really looking for duplicates, I’m looking for non-duplicates. That is, the idea is to make sure that each collection gets the files that are unique to the other. I will have to copy non-duplicate files from OneDrive into Google Photos, and vice-versa.
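Finding the non-duplicates is really a set difference, run once in each direction. Here is a minimal Python sketch of that, again matching on file name only and ignoring case. The function name only_in_first is hypothetical, and nothing here copies or deletes anything.

```python
from pathlib import Path


def only_in_first(first: Path, second: Path) -> list[Path]:
    """Files under first whose name never appears anywhere under second."""
    known = {p.name.lower() for p in second.rglob("*") if p.is_file()}
    return sorted(
        p for p in first.rglob("*")
        if p.is_file() and p.name.lower() not in known
    )
```

Calling it as only_in_first(google, onedrive) and then as only_in_first(onedrive, google) would produce the two lists of files that each collection is missing, at least by name.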

Second, I don’t technically have two collections, I have three: There’s Google Photos, there’s OneDrive’s Camera roll, and then there’s the OneDrive Photo collection.

Third, there will likely be duplicates inside of each of these collections. Does it make sense to de-duplicate each one, in turn, first?

Finally, I’ve only compared file names here and it’s likely that there are duplicate file names that are not, in fact, duplicate photos (or videos). Deleting data based on just file names is dumb, as I could delete the only copy I have of some number of photos.
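One common way to guard against that is to confirm a name match by comparing file contents, typically with a hash. The Python sketch below shows the general technique, not anything AllDup does or anything I ran: it treats two files as true duplicates only if they share a name, a size, and a SHA-256 hash. The function names are hypothetical. Two caveats: hashing has to read every byte, so Files on Demand placeholders would have to download first, and a photo that was re-saved or re-tagged at any point will not match byte for byte even if it looks identical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large videos do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def true_duplicates(root_a: Path, root_b: Path) -> list[tuple[Path, Path]]:
    """Pairs of files with the same name, size, and SHA-256 hash."""
    by_name = {}
    for p in root_b.rglob("*"):
        if p.is_file():
            by_name.setdefault(p.name.lower(), []).append(p)
    pairs = []
    for a in sorted(root_a.rglob("*")):
        if not a.is_file():
            continue
        for b in by_name.get(a.name.lower(), []):
            # Compare sizes first: it is cheap and rules out most mismatches.
            if a.stat().st_size == b.stat().st_size and sha256_of(a) == sha256_of(b):
                pairs.append((a, b))
    return pairs
```

Same-named files that do not show up in the result are exactly the ones that should not be deleted on the strength of a name-only scan.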

Put simply, this test scan was just the first step, assuming this process I am embarking on makes any sense at all.

How to proceed?

I decided to start by deduplicating OneDrive: As part of my previous photo collection decluttering work, I had engaged in a bit of “doom piling” when it came to OneDrive with the theory being that it was better to have duplicates than toss out unique photos. That is, I had an incredible number of phone photo backup folders on my NAS, and after removing duplicates of those (I really am a digital pack rat), I copied the remaining phone photo backup folders into OneDrive, in the appropriate year folders in the Photo collection. Know that many of those photos were likely backed up in the Camera roll folder too.

To see what that looked like, I just did the same type of scan as before using AllDup, but I compared the OneDrive Camera roll folder to the OneDrive Photo collection folder. And this time, the search concluded in just 14 seconds, telling me that of the 143,000 files scanned, it had found about 21,000 duplicates. Which is just 14 percent of the total, lower than expected. (My guess is that I wasn’t as religious as I should have been about backing up phone photos to OneDrive at certain times.)

Comparing individual duplicates is easy enough in AllDup, but there were some obvious issues: The sheer number of files (almost 21,000) would necessitate some form of automation, and because the files I was comparing were Files on Demand stubs, each would need to download locally as I worked on them; that would be time-consuming, and I don’t have the disk space regardless.

I also realized during the scan that I’d really need to compare Google Photos to the OneDrive Photo collection in time as well. But first things first.

AllDup had identified duplicates, but I would have to actually rectify them in some way next. This would require me to figure out how to do that correctly in AllDup, and I obviously didn’t want to screw up and delete photos without being sure of what I was doing. And in keeping with my previous experiments, I felt that it made sense to start with a subset of the data, copied locally, so I could make sure the results were satisfactory. This time I decided to start with 2013, because this was the year that I switched from digital cameras to just using smartphones for photos (with the Lumia 1020).

So I copied the “Photos from 2013” folder from the Google Photos takeout to the Desktop of that laptop. Then, I synced and copied the 2013 folder from OneDrive to the Desktop as well. These folders are still pretty big—about 21.2 GB for the Google Photos version and 22.7 GB for OneDrive—but much more manageable than the full collection.

After pointing AllDup at these two local folders and letting it run a scan, I discovered that …

 

 

Discovered … what? I don’t know: That’s where I stopped writing, and I can barely remember what I found, just that it was yet another blocker, another thing getting in the way of progress. Or, better still, yet another eye-opener that this was perhaps not the right direction either. That if I was really going to solve this problem, I would need to recalibrate yet again.

And so I did. And while this will still be a bit tedious, I think I see a light at the end of this tunnel. I may, in fact, crack this nut.

More soon.
