1. General Objective of the Project
Automate the search for YouTube channels based on combinations of:
A List of Games (e.g., Fortnite, Valorant, etc.).
A List of Keywords (e.g., “ping,” “lag,” etc.), including terms in different languages.
Collect and store information from these channels in a database and display it on a PHP Dashboard.
Update the view metrics of each channel (last 10 videos) every week and record the history so that growth or decline can be tracked.
Send messages (via email and social media) to each channel automatically, following a communication schedule, and notify via Slack or email when there is a response.
Provide a Dashboard for managing game and keyword lists, configuring workers, monitoring collection status and progress, filtering and sorting data, and viewing metric history.
2. Summary of How the System Should Work
PHP Dashboard:
Game Management: Add/remove games (e.g., Fortnite, Valorant, etc.).
Keyword Management: Add/remove terms (e.g., “ping” and “lag”), including in multiple languages.
Worker Configuration: Define how many collection processes will run in parallel and which tasks they perform.
Channel Listing with:
Customizable pagination (100, 200, 500, 1000 channels per page).
Filters by any column (channel name, language, game, social media, etc.).
Sorting (alphabetical order, number of views, date of the last video, growth %, etc.).
Viewing Metrics (average views, date of the last collections, growth %, etc.).
Option to view each channel’s history (views from the last 10 videos over time).
Real-time or near real-time collection status (to see if workers are running, how many combinations remain, etc.).
Data Collection (Done by Workers):
Go through each game + each keyword in each language.
Perform searches on YouTube (using a VPN by country to refine results).
Collect channels, extract information:
Channel name
Channel URL
Social media (Instagram, Discord, Facebook, TikTok, Kwai, VK, etc.)
Channel email
Channel language (when possible to detect)
Game (related to the search that led to the channel)
Date of the last posted video
Views from the last 10 videos (for the trimmed average calculation)
Whenever a social network isn’t found on the channel, search for it on Google and on the social network itself using the channel name or the channel owner’s name.
Store/update in the database.
Weekly Update:
Every week, workers run again:
Search for new channels that have appeared.
Collect the views of the last 10 videos of each channel.
Calculate the average views (excluding the 2 videos with the most views and the 2 with the fewest views, taking the average of the remaining 6).
Record a new history line in the database for that channel, with the date and the average value.
Growth % Calculation:
Compare the current average with the previous collection’s average.
Display this variation in the dashboard, for example: +15%, -7%, etc.
Sending Messages and Notifications:
Use the collected social networks and email.
Contact pipeline (e.g., 1 message/week, varying text for each level of the communication schedule).
When a response is received via email or social media, notify the administrator via Slack and email, so they can resume negotiations.
3. Detailed Step-by-Step
3.1 Database Structure
You can choose any database.
3.2 PHP Dashboard
The Dashboard must allow:
Login and User Management (basic):
Secure access to information and settings.
“Games” Section:
List all registered games.
Add a new game (e.g., “Fortnite,” “Valorant”).
Edit/Delete existing games.
If a game is deleted, the workers will no longer search for that game.
“Keywords” Section:
List all registered keywords.
Add a new keyword (e.g., “ping”) and define the language (e.g., “en”) or even insert variations in multiple languages.
Edit/Delete existing keywords.
If a keyword is deleted, the workers will no longer search for that keyword.
“Workers and Collection Settings” Section:
Define the number of workers (parallel processes).
E.g., 5 workers running simultaneously, each with a subset of the combinations to optimize time.
View status (how many combinations remain, how many channels have been collected, last execution time, etc.).
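One simple way to give each worker its own subset of combinations is a round-robin split. The sketch below is only an illustration: the function name partition_tasks and the idea of splitting a list held in memory are assumptions, and a real system might use a job queue instead.

```python
def partition_tasks(tasks, worker_count):
    """Split tasks into worker_count subsets using a round-robin assignment."""
    if worker_count < 1:
        raise ValueError("worker_count must be at least 1")
    # Worker i receives tasks i, i + worker_count, i + 2 * worker_count, ...
    return [tasks[i::worker_count] for i in range(worker_count)]


# Hypothetical example: 7 combinations shared by 3 workers.
subsets = partition_tasks(list(range(7)), 3)
print(subsets)  # [[0, 3, 6], [1, 4], [2, 5]]
```

The number of tasks left in each subset is also a natural source for the "combinations remaining" figure shown in the status view.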
“Channels” Section:
Channel list in tabular format:
Columns:
Channel Name
URL
Latest View Average
Second-to-last view average
Third-to-last view average
Growth % (comparing the latest average with the second-to-last)
Date of the Last Video
Language
Social Media
etc.
Filter/Search by any column (e.g., channel that has “Fortnite,” channel in “pt-br,” etc.).
Pagination: options for 100, 200, 500, 1000 records per page.
Sorting by columns:
Alphabetical (Channel Name)
Highest/lowest view average
Growth %, etc.
“View Details” button that can open:
Metric History (table or chart with weekly progress).
Additional data, such as social media links.
“Metric History” Section (can be embedded in “Channel Details” or a separate section):
Chart showing the evolution of average views over the weeks.
Display of each collection with date, average, % variation.
“Communication” Section:
Display the message pipeline sent to each channel (or plan this feature).
Option to see if there was a response and mark the status of each channel (e.g., “Waiting for response,” “Negotiating,” “Closed,” etc.).
Slack/email integration to notify when there is a response.
Communication schedule defining the message to be sent at each step of the sequence (first message, second message, third, etc.), written in the language the channel uses (one message per schedule step per language).
3.3 Data Collection Logic (Example with Fortnite and Valorant, and the words “ping” and “lag”)
Combination of Games and Words:
Games List (example): [Fortnite, Valorant]
Words List (example):
“ping” (language: “en”)
“lag” (language: “en”)
Combinations:
“Fortnite + ping”
“Fortnite + lag”
“Valorant + ping”
“Valorant + lag”
In a real project, this expands with all the words in various languages, generating hundreds or thousands of combinations.
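The combination step is a Cartesian product of the two lists. A minimal sketch, assuming the games and keywords have already been loaded from the database; the dictionary layout and the build_combinations name are illustrative, not part of the specification:

```python
from itertools import product

# Hypothetical sample data; in the real system these come from the dashboard tables.
games = ["Fortnite", "Valorant"]
keywords = [("ping", "en"), ("lag", "en")]  # (term, language code)


def build_combinations(games, keywords):
    """Return one search task per game/keyword pair."""
    return [
        {"game": game, "keyword": term, "language": lang, "query": f"{game} {term}"}
        for game, (term, lang) in product(games, keywords)
    ]


combinations = build_combinations(games, keywords)
for task in combinations:
    print(task["query"], task["language"])
```

With G games and K keyword/language entries this produces G × K tasks, which is why the total grows quickly as languages are added.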
Searching on YouTube:
Each Worker takes a combination, for example “Fortnite + ping,” and searches on YouTube.
Language and IP: set the YouTube interface language to “en” (English) and connect through a VPN endpoint located in the US.
Collect the channels that appear in the results (pages 1, 2, 3, … up to the last page shown by the search).
For each channel found:
Name, channel URL.
Access the channel’s “About” page to try to extract email and/or social media links.
If not found, use Google (optional) to search: “Channel X Instagram,” “Channel X Twitter,” etc.
Go into the social network itself and search as well.
Detect language (through heuristic, channel or video location, etc., if desired).
Link the channel to the searched game (Fortnite, in this example).
Store/update in the database.
Collecting Metrics from the Last 10 Videos:
Retrieve the list of recent videos from the channel.
Get the views for each of the last 10 videos.
Remove the 2 videos with the highest views and the 2 with the lowest views to calculate the average with the remaining 6 videos.
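The average described above is a trimmed mean. A minimal sketch follows; the function name is hypothetical, and raising an error for channels with too few videos is an assumption, since the specification does not say how to handle channels with fewer than 10 videos.

```python
def trimmed_average(views, trim=2):
    """Average the view counts after dropping the `trim` highest and `trim` lowest."""
    if len(views) <= 2 * trim:
        # Assumption: the spec does not define this case, so the sketch refuses it.
        raise ValueError("not enough videos to trim")
    ordered = sorted(views)
    kept = ordered[trim:len(ordered) - trim]
    return sum(kept) / len(kept)


# Hypothetical view counts for the last 10 videos, including one viral outlier.
sample = [100, 200, 300, 400, 500, 600, 700, 800, 900, 50000]
print(trimmed_average(sample))  # 550.0
```

Dropping the extremes keeps a single viral or failed video from distorting the weekly figure.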
Weekly Update:
The script (or Worker) runs again and repeats the view collection for each registered channel.
Compares this new average with the previous average and calculates the variation (%).
Example: if the previous average was 10,000 views and now it is 12,000 views:
Growth = ((12,000 - 10,000) / 10,000) * 100 = +20%.
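The growth formula can be sketched as below. Returning None when there is no previous average (or when it is zero) is an assumption; the specification does not define the first collection or a zero baseline. Both function names are illustrative.

```python
def growth_percent(previous_avg, current_avg):
    """Percentage change from the previous average to the current one."""
    if previous_avg is None or previous_avg == 0:
        return None  # Assumption: no meaningful growth without a non-zero baseline.
    return (current_avg - previous_avg) / previous_avg * 100


def format_growth(value):
    """Format the growth for the dashboard, e.g. '+20%' or '-7%'."""
    if value is None:
        return "n/a"
    return f"{value:+.0f}%"


print(format_growth(growth_percent(10_000, 12_000)))  # +20%
print(format_growth(growth_percent(10_000, 9_300)))   # -7%
```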
Dashboard Display:
In the channel list, display the Current Average and Growth (or Decline) in %.
In “Metric History,” display the history in a table or chart.
3.4 Sending Messages (Communication Schedule)
Message Template Creation:
Example: 1st message in each language, 2nd message, 3rd message, etc.
Automation:
The system (or a specific Worker) logs into the sender account on each social network where the channel is present (Instagram, Facebook, etc.), or uses the network’s API, to send the message.
Or, in the case of email, sends via SMTP (e.g., PHPMailer in PHP).
Attempt Control:
Record the date and time of each sending.
Every week, if there was no response, send the next message.
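The weekly follow-up rule can be expressed as a small decision function. This is a sketch under assumptions: the parameter names, the 1-based step numbering, and the choice to stop once the last step has been sent are not stated in the specification.

```python
from datetime import datetime, timedelta


def next_step(last_step, last_sent_at, responded, now, total_steps,
              interval=timedelta(weeks=1)):
    """Return the 1-based step to send now, or None if nothing should be sent."""
    if responded:
        return None  # A reply stops the automated sequence.
    if last_sent_at is None:
        return 1  # Nothing sent yet: start with the first message.
    if last_step >= total_steps:
        return None  # Schedule exhausted.
    if now - last_sent_at >= interval:
        return last_step + 1
    return None  # Less than a week since the last message.


now = datetime(2024, 1, 15)
print(next_step(1, datetime(2024, 1, 8), False, now, total_steps=3))   # 2
print(next_step(1, datetime(2024, 1, 12), False, now, total_steps=3))  # None
```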
Notification:
If the channel responds (via direct message, email, etc.), the system should mark that there was a response and send a Slack/email notification to the administrator.
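For the Slack side, one option is an incoming webhook, which accepts a JSON body with a "text" field. The sketch below uses only the standard library; the message wording and both function names are assumptions, and the webhook URL must come from your own Slack workspace configuration.

```python
import json
import urllib.request


def build_slack_payload(channel_name, source):
    """Build the JSON body for a Slack incoming webhook."""
    return {"text": f"Response received from {channel_name} via {source}."}


def send_slack_notification(webhook_url, payload):
    """POST the payload to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Building the payload needs no network access; sending requires a real webhook URL.
print(build_slack_payload("Example Channel", "email"))
```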
3.5 Advanced Features / Important Considerations
VPN Session Management:
Integrate with a list of proxies/VPN endpoints and switch endpoints on each search. This can be done via a script on the server or by calling VPN APIs. Preferably use NordVPN, as I already have a license and can share it.
Channel Language Detection:
Possibly through the title and description of the videos.
Collection Speed:
Controlled through the workers: each worker handles a share of the combinations, at a pace slow enough to avoid being blocked by YouTube.
Adjust the waiting time between requests and, if captchas appear, solve them through a captcha-solving API.
Channel Deduplication:
If the same channel appears in multiple combinations (e.g., Fortnite + ping, Fortnite + lag, Valorant + ping, etc.), the system should associate that channel with the most relevant game but should not create duplicates.
Check if the channel already exists in the database using the channel_url as a unique key.
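A unique constraint on channel_url lets the database enforce deduplication. The sketch below uses SQLite only because it ships with Python; the table layout and the upsert_channel name are assumptions. Since the specification does not define "most relevant game", this sketch simply keeps the game from the first time the channel was seen.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE channels (
        id INTEGER PRIMARY KEY,
        channel_url TEXT NOT NULL UNIQUE,
        name TEXT NOT NULL,
        game TEXT
    )
    """
)


def upsert_channel(conn, channel_url, name, game):
    """Insert the channel if its URL is new; otherwise refresh its name.

    Returns True when a new row was created.
    """
    url = channel_url.rstrip("/")  # Minimal normalization so trailing slashes match.
    cursor = conn.execute(
        "INSERT OR IGNORE INTO channels (channel_url, name, game) VALUES (?, ?, ?)",
        (url, name, game),
    )
    created = cursor.rowcount == 1
    if not created:
        # Assumption: keep the first game, update only the display name.
        conn.execute("UPDATE channels SET name = ? WHERE channel_url = ?", (name, url))
    conn.commit()
    return created


# Hypothetical channel found by two different combinations.
print(upsert_channel(conn, "https://www.youtube.com/@example", "Example", "Fortnite"))   # True
print(upsert_channel(conn, "https://www.youtube.com/@example/", "Example", "Valorant"))  # False
```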
Scalability:
There may be hundreds of thousands of combinations if there are many games and many words in several languages. Plan the server/worker infrastructure (and IP/VPN handling).
User-Friendly Interface in the Dashboard:
Good usability is important for filtering and sorting data.
Allow exporting (CSV, Excel) channel lists if necessary.
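CSV export needs little more than a list of rows and the columns to include. A minimal sketch, assuming each channel is available as a dictionary; the column names and the channels_to_csv function are illustrative:

```python
import csv
import io


def channels_to_csv(rows, columns):
    """Serialize channel rows to CSV text, keeping only the requested columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical rows as they might come out of the filtered channel list.
rows = [
    {"name": "Example", "url": "https://www.youtube.com/@example", "growth": "+20%"},
    {"name": "Other, Inc.", "url": "https://www.youtube.com/@other", "growth": "-7%"},
]
print(channels_to_csv(rows, ["name", "url"]))
```

The csv module quotes values that contain commas, so channel names such as "Other, Inc." survive the round trip.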