telda-cron-job-scheduler
v1.0.10
Telda cron-scheduler assignment
Cron Job Scheduler Documentation
Table of contents
- Motivation
- Tools used to construct the solution.
- Technical decisions made into consideration.
- Summarizing code
- Possible future improvements.
- Resources
Motivation
A backend server, whatever language, framework, or runtime environment you build it with, usually works the same way: a client sends a request to some endpoint asking for what it wants, the server processes that request, and responds with what suits the user. In the end, you're doing the magic that the client wants.
But what if you want to automate some tasks for the consumer without making them send that request manually? Or what if you want the server to run a specific task on its own, without anyone telling it to?
Real case scenarios
- At my first company, our business analyst used Power BI to maintain, filter, and analyze application data for a specific period, then handed it to the accountant, who ran some calculations and passed the results on to the managers. The BA wanted specific data from the application to be available for download in an admin dashboard as a compressed (zip) file, containing files in specific formats, every day at 9 AM so he could start working. As a software engineer, I had to set up a cron job that did this every day.
- At my current employer, we were building a ticketing system where users reserve tickets that have an expiry date, and expired tickets should be deleted from their accounts every day at midnight. A cron job should automate this too.
Since it's that important, let's see how we can build the code for it!
Tools used to construct solution
Node.js runtime environment using TypeScript
- To ensure type safety for developers. Some engineers argue that types are often added just to show off, but this codebase is small, the solution itself is small, and the data flowing through the package doesn't go through many types or transformations. So TypeScript suits my case well.
Jest, which is one of the most popular unit-testing packages.
Technical decisions made into consideration.
Before diving into anything, a small reminder: JavaScript is a single-threaded language with non-blocking I/O. It has one main thread, which means it has one call stack, which means it can execute one instruction at a time.
If we execute an instruction that takes too much time on the main thread, we block it: the server stays busy until that processing finishes, and any request sent by a client won't get a response until the server becomes idle again.
So we need some kind of concurrency to execute these cron jobs in the background without interrupting the main thread. Let's see what we can do about this.
1. Child Process
From the operating system's point of view, any running application on your machine (e.g. Google Chrome, Spotify, your own application) is a process: its instructions are executed by the CPU, its data is stored in memory (RAM), and its bookkeeping data is stored in what's called a Process Control Block (PCB).
Since our main Node.js server runs in a process, we can spawn a sub-process from it and run our cron job in a separate context, so it doesn't interrupt the server's main process. The two processes can also share data through Inter-Process Communication (IPC), for example by opening a messaging channel between them for data exchange.
This works, and every cron job gets its own separate context to run its task without blocking anything, since every running process has its own thread. But:
Caveats
You're limited by the number of cores on your machine. Therefore, running frequent cron jobs this way on the server isn't a good idea: some processes won't get to run until a CPU core becomes idle to handle the queued jobs. On top of that, context switches between processes take a long time, which causes a lot of overhead.
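The child-process approach described above can be sketched as follows. This is only an illustration, not code from the package: the job name `nightly-cleanup`, the helper `runInChildProcess`, and the inlined child source are all hypothetical. It uses `spawn` from Node's `child_process` module with an `'ipc'` entry in `stdio`, which opens a message channel between parent and child.

```typescript
import { spawn } from 'child_process'

// Hypothetical job body, inlined as a string so the sketch fits in one file.
// A real package would more likely point at a separate script.
const childSource = `
  process.on('message', (msg) => {
    // Runs in the child process, with its own memory and event loop.
    process.send({ job: msg.job, pid: process.pid, done: true })
    process.disconnect() // close the IPC channel so the child can exit
  })
`

interface ChildReply {
  job: string
  pid: number
  done: boolean
}

// Resolves with the child's reply, or rejects if the child fails to start.
const runInChildProcess = (job: string): Promise<ChildReply> =>
  new Promise((resolve, reject) => {
    // The fourth stdio entry ('ipc') opens the message channel.
    const child = spawn(process.execPath, ['-e', childSource], {
      stdio: ['ignore', 'inherit', 'inherit', 'ipc'],
    })
    child.once('message', (reply) => resolve(reply as ChildReply))
    child.once('error', reject)
    child.send({ job })
  })

const childReply = runInChildProcess('nightly-cleanup')
childReply.then((reply) => console.log('child finished', reply))
```

The reply carries the child's own process id, which differs from the parent's: each job pays for a full process, which is the overhead the caveat above refers to.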
So, how about using ...
2. Threads
A process can have multiple threads running concurrently, each executing its own instructions with its own stack.
When a new thread is created for a process, the operating system creates a Thread Control Block (TCB) for it. A TCB stores less data than a PCB, so there is less overhead when making tons of context switches between threads. But again:
Caveats
In Node.js, worker threads run independently: each one has its own isolated context, so ordinary JavaScript state is not shared between them, and data has to be passed explicitly through messages. Therefore, cron jobs cannot simply share their state. I accepted this disadvantage since it's really rare to need that: cron jobs almost always work independently, and I didn't find a case where they don't.
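A minimal sketch of that isolation, assuming Node's `worker_threads` module (the helper `runWorker` and the counter are made up for this illustration): each worker increments what looks like a global counter, yet each one sees only its own copy.

```typescript
import { Worker } from 'worker_threads'

// Worker body as a string (eval: true) so the sketch stays in one file.
const workerSource = `
  const { parentPort } = require('worker_threads')
  globalThis.runs = (globalThis.runs || 0) + 1 // state local to this worker
  parentPort.postMessage(globalThis.runs)
`

// Starts one worker and resolves with the counter value it reports.
const runWorker = (): Promise<number> =>
  new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true })
    worker.once('message', resolve)
    worker.once('error', reject)
  })

// Both workers report 1: neither sees the other's counter.
const isolatedCounts = Promise.all([runWorker(), runWorker()])
isolatedCounts.then((counts) => console.log(counts))
```

Anything a job needs to hand back has to travel through `postMessage`, which is fine for jobs that don't depend on each other.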
So, let's have this implemented!
Summarizing the code
We store the created scheduled tasks in a package-wide object held in a static class member. This class is called Storage, and since the package should have only one universal storage for all scheduled tasks, I applied the Singleton design pattern to guarantee there is one and only one task storage.
class Storage {
  scheduledTasks: Map<string, Task>
  private static storage: Storage

  private constructor() {
    this.scheduledTasks = new Map()
  }

  static createStorage() {
    if (!Storage.storage) {
      Storage.storage = new Storage()
    }
    return Storage.storage
  }

  save(task: Task) {
    this.scheduledTasks.set(task.name, task)
  }

  getTasks() {
    return [...this.scheduledTasks.values()]
  }
}

export default Storage
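To see what the Singleton guarantees in practice, here is a small self-contained sketch. `SketchStorage` and `SketchTask` are trimmed, hypothetical stand-ins for the package's Storage and Task, kept just large enough to run on their own.

```typescript
// Hypothetical, trimmed-down task shape used only for this sketch.
interface SketchTask {
  name: string
}

class SketchStorage {
  private static instance: SketchStorage
  private scheduledTasks = new Map<string, SketchTask>()

  // A private constructor prevents `new SketchStorage()` from outside.
  private constructor() {}

  static createStorage(): SketchStorage {
    if (!SketchStorage.instance) {
      SketchStorage.instance = new SketchStorage()
    }
    return SketchStorage.instance
  }

  save(task: SketchTask): void {
    this.scheduledTasks.set(task.name, task)
  }

  getTasks(): SketchTask[] {
    return Array.from(this.scheduledTasks.values())
  }
}

const first = SketchStorage.createStorage()
const second = SketchStorage.createStorage()

first.save({ name: 'daily-report' })
first.save({ name: 'daily-report' }) // same name: overwrites, does not duplicate

console.log(first === second) // true
console.log(second.getTasks().length) // 1
```

Because tasks are keyed by name in a Map, saving a task with an existing name replaces the earlier one.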
Creating a new scheduled task should work like this
// Assumed wiring: the original snippet uses these names without showing
// where they come from. transformTask, profileTask, TimeStamp and Log are
// internal to the package; their import paths are not shown here.
import fs from 'fs'
import path from 'path'
import { Worker } from 'worker_threads'
import Storage from './store'

const store = Storage.createStorage()

class Task {
  name: string
  task: Function
  frequency: Partial<TimeStamp>
  interval: Partial<TimeStamp>
  logs: Log[]
  runningTask?: Worker

  /**
   * Creates a new task to execute the given function when the cron
   * expression ticks.
   *
   * @param {string} name The name of the scheduled task.
   * @param {Function} task The task to be executed.
   * @param {TimeStamp} frequency How often the task should be executed.
   * @param {TimeStamp} interval The maximum execution time for the task.
   */
  constructor(name: string, task: Function, frequency: Partial<TimeStamp>, interval: Partial<TimeStamp>) {
    // Normalize the name, e.g. "Daily Report" becomes "daily-report"
    this.name = name.replaceAll(' ', '-').toLowerCase()
    this.task = task
    this.frequency = frequency
    this.interval = interval
    this.logs = []
    // Overrides any stored task with the same name
    store.save(this)
  }

  // Only call it within this.start()
  private schedule() {
    // Make sure the ./tasks directory exists
    if (!fs.existsSync(path.join(__dirname, './tasks'))) {
      fs.mkdirSync(path.join(__dirname, './tasks'))
    }
    // Write the task's source to its own file so it can run in a worker
    const taskPath = path.join(__dirname, `./tasks/${this.name}.js`)
    const transformedTask = transformTask(this.name, this.task.toString())
    fs.writeFileSync(taskPath, transformedTask)
  }

  start() {
    this.schedule()
    this.runningTask = profileTask(this)
  }
}

export default Task
N.B.: Here, I'm using JSDoc to document the constructor so that developers using the package know how it works.
I'm storing here the important information for each task:
- Name
- Task to get executed
- Frequency
- Interval
- Logs (instrumented information for the task, e.g. logs, execution time, execution duration, and the status of the task), where the status is one of:
  a. SUCCESSFULL
  b. FAILED (compilation error)
  c. TERMINATED (the task execution took longer than allowed)
  d. PENDING
N.B.: Both interval and frequency are objects of type Partial<TimeStamp>, so the client has the flexibility to choose exactly when to run the task. What if 30m is not enough and I want it after 1d 30m 2s? Let it be like that! For example:
{
  days: 3,
  seconds: 1
}
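As a sketch of how such an object reads in code: the `TimeStamp` interface below is an assumption, inferred from the units the package's conversion helper handles (days, hours, minutes, seconds), and `describeTimeStamp` is a hypothetical helper that is not part of the package.

```typescript
// Assumed shape of TimeStamp, inferred from the units the package handles.
interface TimeStamp {
  days: number
  hours: number
  minutes: number
  seconds: number
}

// "Every 1d 30m 2s": only the units you need are listed.
const frequency: Partial<TimeStamp> = { days: 1, minutes: 30, seconds: 2 }

// Hypothetical helper (not part of the package) that renders a timestamp as text.
const describeTimeStamp = (timestamp: Partial<TimeStamp>): string => {
  const units: Array<[keyof TimeStamp, string]> = [
    ['days', 'd'],
    ['hours', 'h'],
    ['minutes', 'm'],
    ['seconds', 's'],
  ]
  return units
    .filter(([key]) => timestamp[key]) // skip missing or zero units
    .map(([key, suffix]) => `${timestamp[key]}${suffix}`)
    .join(' ')
}

console.log(describeTimeStamp(frequency)) // "1d 30m 2s"
console.log(describeTimeStamp({ days: 3, seconds: 1 })) // "3d 1s"
```

Using `Partial<TimeStamp>` means callers never have to spell out units they don't care about.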
And inside my main file, src/app.ts:
import Storage from './store'
import { getTotalSeconds } from './utils/task.utils'
import Task from './task'

const store = Storage.createStorage()

/**
 * Executes the given task when the cron expression ticks.
 *
 * @param {Task} task The task to be executed.
 */
export const schedule = (task: Task) => {
  task.start()
}

const startTime = process.hrtime()

setInterval(() => {
  const tasksList = store.getTasks()
  tasksList.forEach((task: Task) => {
    const totalSeconds = getTotalSeconds(task.frequency)
    const [passedSeconds] = process.hrtime(startTime)
    // Run the task whenever the elapsed seconds are a multiple of its frequency
    if (passedSeconds % totalSeconds === 0) {
      setTimeout(() => task.start(), 0)
    }
  })
}, 1000) // Runs every second
You can use this schedule() method and pass it your created task to schedule it.
Notice we have a setInterval() whose callback function gets executed every second. This callback loops through the globally stored scheduled tasks and checks whether the time has come for each one to run. The frequency is converted to seconds inside utils/task.utils.ts:
import { TimeStamp } from '../types/types'

export const getTotalSeconds = (timestamp: Partial<TimeStamp>) => {
  let totalSeconds = 0
  for (const [key, value] of Object.entries(timestamp)) {
    if (!value) {
      continue
    }
    switch (key) {
      case 'seconds':
        totalSeconds += value
        break
      case 'minutes':
        totalSeconds += value * 60
        break
      case 'hours':
        totalSeconds += value * 60 * 60
        break
      case 'days':
        totalSeconds += value * 60 * 60 * 24
        break
    }
  }
  return totalSeconds
}
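A worked example of the conversion and of the per-second "is it due?" check may help. The sketch below is a table-driven variant written for illustration, not the package's code: `SECONDS_PER_UNIT`, `toSeconds`, and `isDue` are hypothetical names.

```typescript
// Seconds per unit; the keys mirror the ones handled by getTotalSeconds.
const SECONDS_PER_UNIT = { seconds: 1, minutes: 60, hours: 3600, days: 86400 }

type Unit = keyof typeof SECONDS_PER_UNIT
type Frequency = Partial<Record<Unit, number>>

// Table-driven variant of the conversion (a sketch, not the package's code).
const toSeconds = (timestamp: Frequency): number =>
  (Object.keys(SECONDS_PER_UNIT) as Unit[]).reduce(
    (total, unit) => total + (timestamp[unit] ?? 0) * SECONDS_PER_UNIT[unit],
    0,
  )

// Hypothetical helper: is a task due on this tick? The first condition makes
// the zero-frequency case explicit (x % 0 is NaN, so such a task never runs).
const isDue = (passedSeconds: number, frequencySeconds: number): boolean =>
  frequencySeconds > 0 && passedSeconds % frequencySeconds === 0

console.log(toSeconds({ days: 1, minutes: 30, seconds: 2 })) // 88202
console.log(toSeconds({ days: 3, seconds: 1 })) // 259201
console.log(isDue(120, toSeconds({ minutes: 1 }))) // true
console.log(isDue(90, toSeconds({ minutes: 1 }))) // false
```

So a task with a frequency of one minute fires when 60, 120, 180, ... seconds have elapsed since the scheduler started, not at wall-clock minute boundaries.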
Lastly, inside my module /helpers/profiler.ts, which is the core of the package, each task is launched in its own thread with its own execution context. The module listens to a few events from the instantiated worker thread:
- data: whenever anything gets piped through the stdout channel.
- error: if a runtime error happens during the execution of the task.
- exit: when the task execution finishes.
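The events listed above can be wired up roughly like this. It is a sketch under assumptions, not the package's profiler: `runProfiled` and `RunResult` are hypothetical, the task source is passed as a string instead of a generated file, and the mapping from events to statuses is my reading of the status list given earlier. It relies on the `Worker` class from Node's `worker_threads` module with its `eval` and `stdout` options.

```typescript
import { Worker } from 'worker_threads'

type RunStatus = 'SUCCESSFULL' | 'FAILED' | 'TERMINATED'

interface RunResult {
  status: RunStatus
  output: string
}

// Hypothetical stand-in for the profiler: runs `source` in a worker thread,
// collects its stdout, and terminates it after `maxMs` milliseconds.
const runProfiled = (source: string, maxMs: number): Promise<RunResult> =>
  new Promise((resolve) => {
    // stdout: true exposes the worker's stdout as a readable stream
    const worker = new Worker(source, { eval: true, stdout: true })
    let output = ''
    let status: RunStatus = 'SUCCESSFULL'

    const timer = setTimeout(() => {
      status = 'TERMINATED' // took longer than the allowed interval
      void worker.terminate()
    }, maxMs)

    worker.stdout.on('data', (chunk: Buffer) => {
      output += chunk.toString()
    })
    worker.on('error', () => {
      status = 'FAILED' // uncaught error inside the task
    })
    worker.on('exit', () => {
      clearTimeout(timer)
      resolve({ status, output })
    })
  })

const quickRun = runProfiled(`console.log('report ready')`, 5000)
const failedRun = runProfiled(`throw new Error('boom')`, 5000)
const slowRun = runProfiled(`setInterval(() => {}, 1000)`, 200)

Promise.all([quickRun, failedRun, slowRun]).then((runs) =>
  runs.forEach((run) => console.log(run.status, run.output.trim())),
)
```

The timeout plays the role of the task's interval: a job that is still running when it elapses is stopped with `worker.terminate()` and reported as TERMINATED.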
Possible improvements
Some points should be taken into consideration in my opinion:
- Exact time zone for the task execution. What if I wanted my task to get executed at 9:00 ESET (+02:00 GMT)? That's not supported at the moment, and it might be crucial for other people, as in the business analyst case I explained first.
- Limited execution context. The given tasks should be pure functions only; functions that rely on side effects might not work.
That's it for this task. Please let me know if you have any further questions, or any other improvements you'd like me to work on. I hope my efforts pleased you, and I'm happy to have been assigned this task, which taught me a lot.