NLP: Transfer Learning using TFHub BERT Model

Prasad
2 min read · Mar 29, 2022


For programmers in Artificial Intelligence, training an industry-relevant model from scratch is an enormous challenge, mainly because of the cost of compute. Building competent models for problems in Natural Language Processing, Computer Vision, recommendation systems and the like requires a huge amount of compute resources, and that kind of capacity does not come cheap.

Google's TFHub (http://tfhub.dev) provides AI developers with models that Google has pretrained. Developers can fine-tune these models to cater to their specific needs, so the need to train a model from scratch can be avoided. Fine-tuning involves retraining the model on a dataset that closely resembles the target use case, rather than the data the model was originally trained on. For example, a BERT-based NLP model pretrained on the popular Wikipedia dataset can be further fine-tuned for a specific use case such as classifying the questions found on the Quora website (https://www.quora.com/).

The finetuning workflow can be pictorially represented as follows:

Let's take each block separately and dig deeper.

TFHub (http://tfhub.dev) is a vast repository of pretrained models that Google provides for the use of model developers. Using a simple Python script we can download a model and use it for our own purposes. In the picture above, the first block represents this part.
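As an illustration, here is a minimal sketch of how a pretrained BERT encoder and its matching preprocessing model can be pulled from TFHub with tensorflow_hub. The specific model handles below are just examples; any compatible preprocessor/encoder pair from tfhub.dev would work.

```python
import tensorflow_hub as hub
import tensorflow_text as text  # registers the ops used by the BERT preprocessing model

# Example model handles from tfhub.dev; swap in any compatible preprocessor/encoder pair.
PREPROCESSOR_URL = "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3"
ENCODER_URL = "https://tfhub.dev/tensorflow/small_bert/bert_en_uncased_L-4_H-512_A-8/2"

# Downloads (and locally caches) the pretrained models from TFHub.
preprocessor = hub.KerasLayer(PREPROCESSOR_URL)
encoder = hub.KerasLayer(ENCODER_URL, trainable=True)  # trainable=True enables fine-tuning
```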

Once the model is downloaded, we have to fine-tune, or retrain, it on the new dataset. So we need to download the dataset and then preprocess it; preprocessing ensures the data is in a trainable format. Once the quality of the data is ensured, it is loaded for retraining the model. In the picture, the middle three blocks represent these steps.
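A minimal sketch of this step, assuming a hypothetical CSV file (quora_questions.csv with question_text and target columns) as the new dataset. The raw strings are simply wrapped in a tf.data pipeline; the actual BERT tokenization happens later, inside the TFHub preprocessing layer.

```python
import pandas as pd
import tensorflow as tf

# Hypothetical file and column names; adapt them to the actual dataset being used.
df = pd.read_csv("quora_questions.csv")            # e.g. columns: question_text, target
df = df.dropna(subset=["question_text"])           # drop rows with missing text
texts = df["question_text"].astype(str).values
labels = df["target"].astype("int32").values

# Wrap the raw strings and labels in a tf.data pipeline. No manual tokenization is
# needed here because the TFHub preprocessing layer handles it during training.
dataset = (
    tf.data.Dataset.from_tensor_slices((texts, labels))
    .shuffle(10_000)
    .batch(32)
    .prefetch(tf.data.AUTOTUNE)
)
```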

Once the model is retrained, it is ready for inference on the new use case. The last block on the right side of the picture above represents this step.

All of the above steps are summarized in the Python script below:
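The following is a minimal end-to-end sketch of the workflow, assuming the example TFHub model handles and the dataset pipeline from the earlier snippets; a binary classification head (a single logit) stands in for whatever head the actual use case needs.

```python
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text as text  # registers the ops used by the BERT preprocessing model

PREPROCESSOR_URL = "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3"
ENCODER_URL = "https://tfhub.dev/tensorflow/small_bert/bert_en_uncased_L-4_H-512_A-8/2"


def build_classifier():
    # Raw text goes in; the TFHub preprocessing layer tokenizes it for BERT.
    text_input = tf.keras.layers.Input(shape=(), dtype=tf.string, name="text")
    preprocessor = hub.KerasLayer(PREPROCESSOR_URL, name="preprocessing")
    encoder_inputs = preprocessor(text_input)

    # trainable=True so the pretrained BERT weights are fine-tuned on the new dataset.
    encoder = hub.KerasLayer(ENCODER_URL, trainable=True, name="bert_encoder")
    outputs = encoder(encoder_inputs)

    # pooled_output is BERT's sentence-level representation; add a small head on top.
    net = tf.keras.layers.Dropout(0.1)(outputs["pooled_output"])
    net = tf.keras.layers.Dense(1, activation=None, name="classifier")(net)
    return tf.keras.Model(text_input, net)


model = build_classifier()
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=3e-5),
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
    metrics=[tf.keras.metrics.BinaryAccuracy()],
)

# `dataset` is the tf.data pipeline built during the preprocessing step.
model.fit(dataset, epochs=3)

# Inference on new, unseen questions for the new use case.
probs = tf.sigmoid(model.predict(tf.constant(["Is the earth flat?"])))
print(probs.numpy())
```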

