How to run your first batch in Openshift using cronjobs

TL;DR: Cronjobs are Kubernetes objects that let you launch short, scheduled tasks such as batches onto the cluster. In this article we will go through the key commands to use them, see a simple file transfer example, and finally cover some considerations regarding concrete use cases.

Let’s face it: you have a brand new application, it is stateless, embraces all the 12 factors, and even uses a lot of fancy technologies to be cloud native

… And then the client calls, and you have to integrate a plain old file using a good old batch transfer. True story.

Don’t despair, Openshift has you covered!

A quick introduction to cronjobs in Openshift

Openshift provides primitives for deploying temporary workloads that run to completion, called Cronjobs. This seems like a perfect fit for your client’s needs :)

Cronjobs can be used to launch scheduled batches or any other process that has a finite lifespan and needs to be executed periodically. The official documentation is here (for version 4.11). On a side note, Cronjobs have also been stable in upstream Kubernetes since v1.21, and the related k8s documentation can be found here

Note
Note that cronjobs will by default consume your NotTerminating quota (a.k.a. the regular one), because even though they are short-lived, they are not explicitly time-bound. If you specifically want your job to use the Terminating quota, you have to set a .spec.activeDeadlineSeconds entry in your yaml file; this limits the lifetime of all pods of the job, and the job is marked failed and its pods are terminated if that lifetime is exceeded. You can read more on this here.
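For instance, here is a minimal sketch of a cronjob carrying such a deadline (the name and the 300 seconds value are arbitrary; note that inside a cronjob the field sits under .spec.jobTemplate.spec):

apiVersion: batch/v1
kind: CronJob
metadata:
  name: hello-deadline      # arbitrary name, for illustration only
spec:
  schedule: '*/5 * * * *'
  jobTemplate:
    spec:
      # limit the lifetime of all pods of each job to 5 minutes
      activeDeadlineSeconds: 300
      template:
        spec:
          containers:
          - name: hello
            image: hello-world
          restartPolicy: Never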

Create a Cronjob

You can easily create a new cronjob with the oc cli using the following command: oc run <cronjob name> --image=<image that contains job> --schedule="<cron schedule>" --restart=<restart option> --labels parent="<a label>", e.g.:

oc run hello-world --image=hello-world --schedule='*/1 * * * *' --restart=Never --labels parent="cronjobhello"
Tip
For the cron schedule, you can see this website: https://crontab.guru/

You can also use the --command parameter to override the default docker entrypoint with a command of your choice, e.g.:

oc run pi --image=perl --schedule='*/1 * * * *' --restart=OnFailure --labels parent="cronjobpi" --command -- perl -Mbignum=bpi -wle 'print bpi(2000)'

List existing Cronjobs

You can access cronjob objects just like any other object in Openshift:

$ oc get cronjobs
  NAME        SCHEDULE        SUSPEND   ACTIVE    LAST SCHEDULE   AGE
  pi          */1 * * * *     False     0         1m              10m

Retrieve the status of a specific Job

A cronjob object will create one or several jobs in your namespace. Jobs have three lifecycle parameters (the snippet after this list shows where they live in a job manifest):

  • job.Spec.Completions which indicates the number of successfully terminated pods required to declare the job successful,
  • job.Spec.BackoffLimit which specifies the number of failed pods beyond which a job will be flagged failed,
  • and finally .spec.activeDeadlineSeconds, described above.
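To make these parameters concrete, here is a sketch of a job manifest showing where they live (the values are arbitrary and only meant for illustration):

apiVersion: batch/v1
kind: Job
metadata:
  name: pi-sample                 # hypothetical job name
spec:
  completions: 1                  # pods that must succeed for the job to be declared successful
  backoffLimit: 6                 # failed pods tolerated before the job is flagged as failed
  activeDeadlineSeconds: 300      # hard lifetime limit, as described above
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ['perl', '-Mbignum=bpi', '-wle', 'print bpi(2000)']
      restartPolicy: Never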

Every job can then succeed or fail, according to the following rules:

if job.Status.Succeeded >= *job.Spec.Completions {
    return "completed"
} else if job.Status.Failed >= *job.Spec.BackoffLimit {
    return "failed: reason: BackoffLimitExceeded"
} else if <elapsed> >= *job.Spec.ActiveDeadlineSeconds {
    return "failed: reason: DeadlineExceeded"
}

You can list the jobs directly like this:

$ oc get jobs
  NAME            DESIRED   SUCCESSFUL   AGE
  pi-1593618780   1         1            3m
  pi-1593618840   1         1            2m
  pi-1593618900   1         1            1m
  pi-1593618960   1         1            17s

You can then inspect any job to see its status:

$ oc get jobs pi-1593618960
  NAME            DESIRED   SUCCESSFUL   AGE
  pi-1593618960   1         1            2m

Or more verbosely using yaml or json output:

oc get jobs pi-1593618960 -o yaml

…where you can inspect the reason for a specific failure, here a BackoffLimitExceeded error:

status:
  conditions:
  - lastProbeTime: 2018-04-25T22:38:34Z
    lastTransitionTime: 2018-04-25T22:38:34Z
    message: Job has reached the specified backoff limit
    reason: BackoffLimitExceeded
    status: "True"
    type: Failed
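To dig further, you can also list the pods spawned by that job (the job controller labels them with the job name) and read their logs:

oc get pods -l job-name=pi-1593618960
oc logs <name of one of the listed pods>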

Run a Cronjob manually

After adding a cronjob, you may need to test it, or perhaps launch it on demand for maintenance purposes.

In both cases, you can manually create a new job from an existing cronjob using this syntax: oc create job --from=cronjob/<cronjob_name> <job_name>, e.g.:

oc create job --from=cronjob/pi  pi-manual-001
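If you want to follow that manual run, you can stream the logs of its pod directly through the job object:

oc logs -f job/pi-manual-001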

Pause a Cronjob

Every Cronjob has a .spec.suspend property, that you can patch using oc patch or oc edit. This property can be true or false.

To temporarily disable a specific cronjob, use the following command: oc patch cronjob <job-name> -p '{"spec" : {"suspend" : true }}', e.g.:

oc patch cronjob pi -p '{"spec" : {"suspend" : true }}'

To re-enable the cronjob, just use oc patch cronjob <job-name> -p '{"spec" : {"suspend" : false }}'.
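To check the current value of the flag, you can query it directly, for example with a jsonpath output:

oc get cronjob pi -o jsonpath='{.spec.suspend}'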

Delete a Cronjob

To delete a cronjob, simply call oc delete cronjob <cronjob-name>, e.g.:

oc delete cronjob pi

And that’s it, you now have everything in your toolbox to dive into a real use case. Let’s see how we can put that into practice.

A simple use case: File Transfer

General approach

More often than not, a batch will need to download an external file before running, or will have to send a file to an external endpoint when it ends. In the following paragraphs, we will show a simple way to schedule file transfers from inside our cluster, using cronjobs:

Note
Kubernetes does NOT support DAG workflows; this is a design choice (see this PR). To circumvent this limitation, you have to rely on initContainers (see here)
Warning
Not all protocols can be used. Keep in mind that your pod is not visible from the external network, so active FTP transfer is not an option. As a matter of fact, neither is passive FTP transfer, as it requires too wide a range of ports to be open between the cluster and the FTP server.

An example: SFTP Transfer

SFTP (SSH File Transfer Protocol), unlike FTP, uses a single secure tunnel for both the command and data connections, and behaves nicely through firewalls.

Note
Do not confuse SFTP (SSH File Transfer Protocol, as described in RFC4253 and later) with SFTP (Simple File Transfer Protocol, as described in RFC913, note the HISTORIC status) or FTPS (File Transfer Protocol over SSL, as described in RFC4217). The Simple File Transfer Protocol is intrinsically insecure and pretty jurassic, while FTPS won’t help you much going through firewalls…

Let’s have a look at the process we will put in place. To do this, we will rely on a simple yaml file:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sftp-storage
  labels:
    application: sftp-job
spec:
  # use whatever storageClass is available in your cluster
  storageClassName: nfs
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Mi
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: sftp
  labels:
    application: sftp-job
spec:
  schedule: '*/1 * * * *'
  jobTemplate:
    spec:
      template:
        metadata:
          labels:          
            parent: "cronjobsftp"
            application: sftp-job
        spec:
          initContainers:
            - name: sftp-transfer
              image: centos/httpd-24-centos7
              command: ['sh', '-c', 'curl -o /usr/data/readme.txt --create-dirs sftp://test.rebex.net:22/readme.txt -u demo:password -vk']
              # use proxy tunneling, verbose output and insecure connection
              volumeMounts:
              - mountPath: /usr/data
                name: sftp-volume
          containers:
            - name: sftp-job
              image: centos/httpd-24-centos7
              command: ['sh', '-c', 'cat /usr/data/readme.txt']
              volumeMounts:
                - mountPath: /usr/data
                  name: sftp-volume
          restartPolicy: OnFailure
          volumes:
          - name: sftp-volume
            persistentVolumeClaim:
              claimName: sftp-storage

Some explanations will get you started quickly:

  • We use the same volume for both the init container and the main container, so that the downloaded file is visible from both sides. For this, we rely on a persistentvolumeclaim, which is created beforehand.
  • We use the cURL command to retrieve the file. Several options are worth considering depending on your situation. Let’s have a deeper look at what’s going on there:
    • sh -c is used because cURL interactive authentication depends on being executed inside a shell session. You can bypass this if you use a client certificate;
    • -o /usr/data/readme.txt --create-dirs outputs the result of cURL inside the designated file, creating the full path if needed;
    • we use https://test.rebex.net/ which provides several file transfer test servers, and which can be very handy to test your setup without having your final server available;
    • -u demo:password is the regular interactive authentication process;
    • funkier is the -vk hieroglyph: -v for verbose, -k for insecure (because this is a test, do not use this in production).
Tip
You may require the -p or --proxytunnel option in addition to the regular -x option if you are behind a proxy, to force the proxy to establish an SSL tunnel rather than merely passing HTTP-equivalent commands. This is required if you face Microsoft-like servers that issue HTML content or HTTP 302 on receiving HTTP requests instead of regular SFTP ones.
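As a purely hypothetical illustration, assuming a corporate proxy listening on proxy.mycompany.local:3128 (replace with your own), the command in the init container would become something like:

curl -o /usr/data/readme.txt --create-dirs sftp://test.rebex.net:22/readme.txt -u demo:password -vk -x http://proxy.mycompany.local:3128 --proxytunnel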

Once you’ve understood all of these lines, create a file named sftp-sample.yml, paste the content of the yaml into it, and then fire up your cronjob like this:

oc create -f sftp-sample.yml
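You can quickly confirm that both the persistent volume claim and the cronjob were created by listing them through the label we set:

oc get pvc,cronjob -l application=sftp-job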

Then issue this command in your terminal and wait for the magic to happen:

oc get pods --watch

You’ll eventually witness some pods spawning:

NAME                         READY   STATUS            RESTARTS   AGE
sftp-1603289220-pm65k        0/1     Pending           0          <invalid>
sftp-1603289220-pm65k        0/1     Pending           0          <invalid>
sftp-1603289220-pm65k        0/1     Init:0/1          0          <invalid>
sftp-1603289220-pm65k        0/1     Init:0/1          0          <invalid>
sftp-1603289220-pm65k        0/1     PodInitializing   0          <invalid>
sftp-1603289220-pm65k        0/1     Completed         0          <invalid>

You can check that everything went well on the transfer side (use the -c option to get the logs from the specified container):

oc logs sftp-1603289220-pm65k -c sftp-transfer

Which should show you the trick (remember the -v option):

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 195.144.107.198:22...
* TCP_NODELAY set
* Connected to test.rebex.net (195.144.107.198) port 22 (#0)
* User: demo
* Authentication using SSH public key file
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0* completed keyboard interactive authentication
* Authentication complete
{ [405 bytes data]
100   405  100   405    0     0    237      0  0:00:01  0:00:01 --:--:--   237
100   405  100   405    0     0    237      0  0:00:01  0:00:01 --:--:--   237
* Connection #0 to host test.rebex.net left intact

The same goes on the job side:

oc logs sftp-1603289220-pm65k -c sftp-job

And TADAAA, your file is there in all its glory:

Welcome,

you are connected to an FTP or SFTP server used for testing purposes by Rebex FTP/SSL or Rebex SFTP sample code.
Only read access is allowed and the FTP download speed is limited to 16KBps.

For information about Rebex FTP/SSL, Rebex SFTP and other Rebex .NET components, please visit our website at http://www.rebex.net/

For feedback and support, contact support@rebex.net

Thanks!

When you’re done, don’t forget to delete everything:

oc delete cronjobs,jobs,pvc -l application=sftp-job

Wrap up and final considerations

That’s it: we deployed our first batch in Openshift, and we are able to schedule, pause or relaunch it on demand. We even put that into practice to download a file over an SFTP connection.

In practice, I personally use Cronjobs in production for a lot of admin processes in the sense of the Twelve Factors (reference); this ranges from database backups to report generation, but the scope that you can cover is quite large.

Beware though, Cronjobs are no silver bullet. To deploy batches on Openshift, you first have to assess whether your batch is able to run safely on Openshift or not, regarding resource consumption, statelessness and resilience.

You can have a good overview of those concerns in this article from my friend Alexandre Touret.

Then again, Cronjobs are well suited for small, one-shot, standalone processes. If you require a more complex workflow, you should rather consider tools like Volcano or even a good old Quartz scheduler.

There are still plenty of things we can investigate with Cronjobs, but we will leave those for later, so stay tuned!