Thanks a lot! Yep, I'm working on an infinite loading feature, I'm just trying to find a good heuristic so it doesn't start listing irrelevant websites. Thanks for the suggestion!
Yeah, I need to curate the existing list of websites and perhaps swap the existing list for one with more popular websites. I'll definitely have a look, thanks!
That's a really interesting idea that I didn't get to experiment with, unfortunately. The crawler currently screenshots each crawled website and extracts the colors from the pixel values.
One filter I added to make the palette a bit more relevant: I try to work out which color is the background (the most used color) and only pick colors that contrast with it at least somewhat, assuming that websites follow basic color contrast rules.
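To make the background filter above concrete, here is a minimal sketch, not the actual crawler code. It assumes the screenshot has already been decoded into a list of (R, G, B) tuples, and it uses the WCAG contrast ratio as the contrast measure, which is my assumption about a reasonable metric. The names `contrasting_palette`, `min_ratio` and `size` and the 1.5 threshold are made up for illustration.

```python
from collections import Counter


def _linear(channel):
    """Convert one 0-255 sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(a, b):
    """WCAG contrast ratio between two RGB colors, from 1.0 up to 21.0."""
    la, lb = relative_luminance(a), relative_luminance(b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


def contrasting_palette(pixels, min_ratio=1.5, size=5):
    """Treat the most frequent color as the background and keep only the
    most frequent remaining colors that contrast with it.

    min_ratio is a hypothetical threshold, not a value from the crawler.
    """
    counts = Counter(pixels)
    background, _ = counts.most_common(1)[0]
    palette = [
        color
        for color, _ in counts.most_common()
        if color != background and contrast_ratio(color, background) >= min_ratio
    ]
    return background, palette[:size]
```

With a mostly white page, a near-white shade such as (250, 250, 250) falls below the threshold and is dropped, while a blue accent or black text is kept.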
When I tried extracting the palettes from CSS rules, what I ran into is that each website uses lots of vendors that render stuff (cookie consents, promotional pop-ups, etc.) and it's hard to distinguish what's rendered from what's not with pure CSS selectors, so I resorted to decomposing the website's theme from the perspective of a user (through a screenshot).
It's a great idea nonetheless, I'm sure I would have some good results with it if I manage to filter out the noise, thanks a lot!
There are definitely some odd ones. The screenshots were taken a few hours after the initial crawl, so some pop-ups may have appeared on the first crawl that aren't in the screenshots.
The crawler also blocks image loading so that image colors aren't taken into account: it screenshots the entire page and builds a pixel-wise frequency count of each color from the screenshot. That means it's possible that some fallback CSS was displayed instead of the images.
I definitely have to refine it more and it's far from perfect. There are only ~25k websites indexed so far, taken from a list of the 1M most popular websites created by the developers of a Chrome extension, based on what most people were visiting.
I tried my best to sort by popularity but I'm sure that the list is not perfect. Plus, some websites are really unfriendly to crawlers and, as much as you try to hide it (by faking user agents, using realistic window dimensions, proxies, etc.), some will still detect you.
Haha, lots of great ideas! I'm definitely considering adding an option to search by color proportions, just trying to find a UI/UX that is not confusing for people.
I also have a lot more content (e.g. the full HTML code of the website, the fonts and typography options) that I extracted for each website, but it's tricky to come up with an easy-to-use UI/UX for this as a search filter, given that there aren't many other websites that do this.
The site is looking great! I think that screenshots are a must, it's a lot easier for the end-user to preview the website's content. If you're referring to screenshots where certain colors are used, I don't think that's necessary.
In the first version, I didn't take any screenshots and it felt very empty. It's a bit hard to manage the images and deliver them quickly, in an optimized form, but I think it's worth it.
Oh, I was referring to calculating the colors used on a website. To give you an example, we've stumbled upon pages where a product image contained the color red and nothing else on the page was red. Our naive approach to finding average colors was to simply take a screenshot of the page and process that image. Unfortunately, this method produces inaccurate results, like red being part of the average colors when the design system has no red. I was curious if you managed to work around this issue. Happy to chat more if you're interested, you can reach me at murat[at]rapidpages.io
Ah, sorry, I understand what you mean now. Yeah, I've also filtered out images when calculating the color percentages and there are definitely some funky results.
There's definitely a middle-ground approach here, I just haven't found a good enough heuristic. It's true that images such as banners and logos are part of the branding and they impact the color palette, so they should be considered in some quantity. I thought of maybe averaging the results with and without images, but I don't have enough test data to validate it yet.
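Just to illustrate the averaging idea, here is a rough sketch that hasn't been validated on real data. It assumes each crawl produces a dict mapping a color to its share of the page (0 to 1). The function name `blend_color_shares` and the `image_weight` parameter are hypothetical, and 0.5 is only a starting point that would need tuning.

```python
def blend_color_shares(with_images, without_images, image_weight=0.5):
    """Weighted average of two color-share mappings (color -> fraction).

    image_weight is how much the crawl with images counts; the crawl
    without images gets the remaining weight. Colors missing from one
    crawl count as 0 there. The result is rescaled to sum to 1.
    """
    if not 0.0 <= image_weight <= 1.0:
        raise ValueError("image_weight must be between 0 and 1")

    colors = set(with_images) | set(without_images)
    blended = {
        color: image_weight * with_images.get(color, 0.0)
        + (1.0 - image_weight) * without_images.get(color, 0.0)
        for color in colors
    }
    total = sum(blended.values())
    if total == 0:
        return {}
    return {color: share / total for color, share in blended.items()}
```

A red that only shows up in a product photo then keeps some share, but a smaller one than in the with-images crawl alone, and lowering `image_weight` pushes it down further.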
Let's chat and discuss more! I'll also add you on Twitter/X
Haha, thanks! That's exactly what I had in mind when building.
I've also collected what fonts and typography styles different websites use so I'm looking to create a visual for that as well, thanks for the feedback!