There has been growing interest in facial capture and animation in recent years. Visual effects and video game companies want to create and deliver new experiences to their audiences, whether a hyperrealistic human character or a fantastical creature built with visual effects and driven by a human performer. Rapidly advancing technology, the search for more efficient ways to deliver high-quality animation, and the increasing visual realism of performances on high-end systems have driven a wave of innovative academic and industrial developments. As a result, facial animation research and development has sparked enormous interest at universities and entertainment industry conglomerates, with technical advances in computer vision and graphics central to many of these developments.
Some of the notable advances are face tracking and motion capture, face tuning, dense non-rigid mesh registration, measurement of skin rendering attributes through various models, and sensing technology. Photorealistic synthesis and advanced facial editing of portrait images find applications in several areas of animation, including special effects, extended reality, virtual worlds, and next-generation communication. Artists typically want control over the semantic parameters of facial capture technology, such as geometric identity, expressions, reflectance, or scene lighting, during the content creation process. Image synthesis has also made huge strides with the emergence of Generative Adversarial Networks (GANs). Essentially, GANs learn the distribution of an image domain and produce new samples from that same distribution.
For such use cases, StyleGAN is one of the most popular choices. Not only does it achieve cutting-edge visual quality, but it also demonstrates fantastic editing capabilities thanks to its organically formed, disentangled latent space. Building on this property, many methods exploit StyleGAN's latent space editing capabilities to perform different kinds of image edits, such as changing facial orientation, expression, or age, by traversing the learned directions. In addition, synthetic image data has been found to be useful as additional training data, even though these are graphically rendered images, for a wide range of applications such as facial reconstruction, gaze estimation, human pose estimation, and shape estimation.
What is Pivotal Tuning for images?
Traditional advanced facial editing techniques take advantage of the generative power of a pre-trained StyleGAN. To successfully edit an image with StyleGAN, you must first invert the sample image into the domain of the pre-trained generator. It turns out that StyleGAN's latent space induces an inherent trade-off between distortion and editability, that is, between faithfully reconstructing the original appearance and convincingly modifying some of its attributes. In practice, this means it is difficult to apply identity-preserving facial editing techniques to faces that lie outside the generator's domain. Pivotal Tuning is one approach to bridging this gap. The technique modifies the generator only slightly, so that an out-of-domain image is correctly mapped into the in-domain latent space. The key idea of Pivotal Tuning is a brief training process that preserves the editing quality of an in-domain latent region while changing its portrayed identity and appearance.
In Pivotal Tuning Inversion (PTI), an initial inverted latent code serves as a pivot, around which the generator is fine-tuned. At the same time, a regularization term keeps nearby identities intact, so that the change stays local. This training process ends up shifting the appearance features that primarily represent identity without harming the model's editing capabilities. Pivotal Tuning can also adjust the generator to accommodate multiple faces while introducing negligible distortion to the rest of the image domain. Evaluated on inversion and editing metrics, it performs better than current state-of-the-art methods. Using PTI, advanced edits such as pose, age, or expression can be applied to many images of well-known and recognizable identities. It also withstands harder cases, such as images that include heavy makeup, elaborate hairstyles, or headwear. The method reconstructs even out-of-domain visual details, such as face paint or hands, much better than state-of-the-art methods.
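To make the idea concrete, below is a minimal sketch of the pivotal tuning loop. It is illustrative only, not the authors' implementation: a toy generator stands in for StyleGAN's synthesis network, and random tensors stand in for the inverted latent code and the real photo.

# Minimal sketch of Pivotal Tuning (assumptions: toy generator, random pivot and target).
import torch
import torch.nn as nn
import lpips

class ToyGenerator(nn.Module):
    """Stand-in for a synthesis network: maps a 512-d latent code to a 3x64x64 image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(512, 3 * 64 * 64), nn.Tanh())

    def forward(self, w):
        return self.net(w).view(-1, 3, 64, 64)

generator = ToyGenerator()
w_pivot = torch.randn(1, 512)              # pretend: latent code from an off-the-shelf inversion
target = torch.rand(1, 3, 64, 64) * 2 - 1  # pretend: the aligned real image, scaled to [-1, 1]

percep = lpips.LPIPS(net='vgg')            # perceptual loss from the lpips package
mse = nn.MSELoss()
opt = torch.optim.Adam(generator.parameters(), lr=3e-4)

# The pivot stays frozen; only the generator is briefly fine-tuned so that G(w_pivot)
# reproduces the target while the surrounding latent space is barely disturbed.
for step in range(100):
    opt.zero_grad()
    synth = generator(w_pivot)
    loss = mse(synth, target) + percep(synth, target).mean()
    loss.backward()
    opt.step()

The full method additionally uses a locality regularizer that samples latents near the pivot and penalizes changes to their outputs; the sketch above only shows the reconstruction part.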

Here we can see an example of real images edited using PTI's multi-ID StyleGAN. All of the images depicted here are generated by the same model, fine-tuned on political and industrial leaders of the world. As can be seen, when various editing operations are applied to these newly introduced and highly recognizable identities, the model still preserves the facial features well.

Such a technique can be a boon to the image processing and photography industry, as every required aspect can be customized as needed without compromising quality or having to reshoot!

First steps with Pivotal Tuning for images
We will implement Pivotal Tuning on an image using Pivotal Tuning Inversion. The input image will be automatically modified with different edits, such as inversion, aging, adding or modifying a smile, and changing the pose. This will show us the different capabilities of the PTI model and how effective these editing techniques are at performing automatic edits without degrading image quality. The following code is partly inspired by the creators of PTI, and their GitHub repository can be accessed from here.
Installation of libraries
The first step is to install the required libraries. We initially need two: Weights & Biases, which helps us track our machine learning model and provides faster and more efficient visualization, and lpips, which provides perceptual similarity metrics to help train our model across different image perspectives. You can use the following code to install them:
!pip install wandb
!pip install lpips
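If you want to confirm that both installs work, here is a short optional check; the project name is arbitrary, and offline mode is used so no W&B account is needed for this quick test.

import torch
import lpips
import wandb

# Perceptual distance between two (here random) images scaled to [-1, 1]; lower means more similar.
loss_fn = lpips.LPIPS(net='alex')
img_a = torch.rand(1, 3, 256, 256) * 2 - 1
img_b = torch.rand(1, 3, 256, 256) * 2 - 1
distance = loss_fn(img_a, img_b).item()
print('LPIPS distance:', distance)

# Log a dummy metric; offline mode keeps everything local.
wandb.init(project='pti-demo', mode='offline')
wandb.log({'lpips_example': distance})
wandb.finish()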
Installing ninja, which the StyleGAN model's custom ops require,
!wget https://github.com/ninja-build/ninja/releases/download/v1.8.2/ninja-linux.zip
!sudo unzip ninja-linux.zip -d /usr/local/bin/
!sudo update-alternatives --install /usr/bin/ninja ninja /usr/local/bin/ninja 1 --force

Cloning the PTI repository,

import os
os.chdir('/content')
CODE_DIR = 'PTI'
!git clone https://github.com/danielroich/PTI.git $CODE_DIR
Installation of dependencies and pre-trained models
Next, we will import all the other dependencies required to build our model; here we also use PyTorch and pull in PTI's image editing functions.
import os
import sys
import pickle
import numpy as np
from PIL import Image
import torch
from configs import paths_config, hyperparameters, global_config
from utils.align_data import pre_process_images
from scripts.run_pti import run_PTI
from IPython.display import display
import matplotlib.pyplot as plt
from scripts.latent_editor_wrapper import LatentEditorWrapper
Importing the pre-trained weights for our StyleGAN,
## Download pretrained StyleGAN trained on FFHQ 1024x1024
## note: `downloader` and `save_path` come from the notebook's earlier setup cells (not shown here)
downloader.download_file("125OG7SMkXI-Kf2aqiwLLHyCvSW-gZk3M", os.path.join(save_path, 'ffhq.pkl'))

Downloading the Dlib tool used to align and preprocess images before feeding the input to PTI,

# Download Dlib tool for alignment
downloader.download_file("1xPmn19T6Bdd-_RfCVlgNBbfYoh1muYxR", os.path.join(save_path, 'align.dat'))
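An optional check that the files landed where the rest of the notebook expects them; this assumes `save_path` points at the pretrained_models directory used above.

import os

for fname in ('ffhq.pkl', 'align.dat'):
    path = os.path.join(save_path, fname)
    status = f'{os.path.getsize(path)} bytes' if os.path.isfile(path) else 'MISSING'
    print(fname, '->', status)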
Configuring the model configuration
Now we will organize and configure our model using everything imported above.
image_dir_name = 'image'
# If set to True, download the desired image from the given URL.
# If set to False, the notebook assumes you have uploaded a personal image to the 'image_original' dir.
use_image_online = True
image_name = 'personal_image'
use_multi_id_training = False
global_config.device = 'cuda'
paths_config.e4e = '/content/PTI/pretrained_models/e4e_ffhq_encode.pt'
paths_config.input_data_id = image_dir_name
paths_config.input_data_path = f'/content/PTI/{image_dir_name}_processed'
paths_config.stylegan2_ada_ffhq = '/content/PTI/pretrained_models/ffhq.pkl'
paths_config.checkpoints_dir = '/content/PTI/'
paths_config.style_clip_pretrained_mappers = '/content/PTI/pretrained_models'
hyperparameters.use_locality_regularization = False
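A quick print of the key paths before moving on can save debugging time later; this is purely a verification step.

print('input images :', paths_config.input_data_path)
print('StyleGAN pkl :', paths_config.stylegan2_ada_ffhq)
print('checkpoints  :', paths_config.checkpoints_dir)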
Image preprocessing
We will now preprocess our input image: we correct the alignment of the input image before feeding it to the model, for better results. Here I am using an image of Serena Williams to adjust and correct the head alignment.
## Download real face image
## If you want to use your own image, skip this part and upload an image/images of your choosing to the image_original dir
if use_image_online:
  !wget -O personal_image.jpg https://static01.nyt.com/images/2019/09/09/opinion/09Hunter1/09Hunter1-superJumbo.jpg ## Photo of Serena Williams

original_image = Image.open(f'{image_name}.jpg')  # open the original image
original_image

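One small glue step is easy to miss here: pre_process_images reads from the `{image_dir_name}_original` folder, so the downloaded photo has to be copied there first. A minimal sketch, assuming the wget above saved the file in the current working directory:

import os
import shutil

os.makedirs(f'/content/PTI/{image_dir_name}_original', exist_ok=True)
shutil.copy(f'{image_name}.jpg', f'/content/PTI/{image_dir_name}_original/{image_name}.jpg')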
pre_process_images(f'/content/PTI/{image_dir_name}_original')
aligned_image = Image.open(f'/content/PTI/{image_dir_name}_processed/{image_name}.jpeg')
aligned_image.resize((512, 512))  # display the aligned image
Corrected image:

Invert images using PTI
Training our model,
os.chdir('/content/PTI')
model_id = run_PTI(use_wandb=False, use_multi_id_training=use_multi_id_training)

Visualizing our results,

def display_alongside_source_image(images):
    res = np.concatenate([np.array(image) for image in images], axis=1)
    return Image.fromarray(res)

# loading the original and fine-tuned generators
def load_generators(model_id, image_name):
  with open(paths_config.stylegan2_ada_ffhq, 'rb') as f:
    old_G = pickle.load(f)['G_ema'].cuda()
  with open(f'{paths_config.checkpoints_dir}/model_{model_id}_{image_name}.pt', 'rb') as f_new:
    new_G = torch.load(f_new).cuda()
  return old_G, new_G

generator_type = paths_config.multi_id_model_type if use_multi_id_training else image_name
old_G, new_G = load_generators(model_id, generator_type)

# performing image synthesis
def plot_syn_images(syn_images):
  for img in syn_images:
      img = (img.permute(0, 2, 3, 1) * 127.5 + 128).clamp(0, 255).to(torch.uint8).detach().cpu().numpy()[0]
      plt.axis('off')
      resized_image = Image.fromarray(img, mode='RGB').resize((256, 256))
      display(resized_image)
      del img
      del resized_image
      torch.cuda.empty_cache()

w_path_dir = f'{paths_config.embedding_base_dir}/{paths_config.input_data_id}'
embedding_dir = f'{w_path_dir}/{paths_config.pti_results_keyword}/{image_name}'
w_pivot = torch.load(f'{embedding_dir}/0.pt')

old_image = old_G.synthesis(w_pivot, noise_mode='const', force_fp32=True)
new_image = new_G.synthesis(w_pivot, noise_mode='const', force_fp32=True)

print('The upper image is the inversion before Pivotal Tuning and the lower image is the result of Pivotal Tuning')
plot_syn_images([old_image, new_image])
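To keep the before/after comparison around, the pair can also be written to disk by reusing the helpers above; the output path below is arbitrary.

# Convert a synthesis tensor back to a PIL image, then save both results side by side.
def to_pil(t):
    arr = (t.permute(0, 2, 3, 1) * 127.5 + 128).clamp(0, 255).to(torch.uint8).detach().cpu().numpy()[0]
    return Image.fromarray(arr, mode='RGB')

comparison = display_alongside_source_image([to_pil(old_image), to_pil(new_image)])
comparison.save('/content/PTI/inversion_comparison.jpg')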

As we can see, with the help of PTI, the fine-tuned generator reconstructs the input face far more faithfully than the plain inversion!
Performing different editing techniques on the image
We are now going to apply different edits, such as the inversion itself, adding or modifying a smile, and changing the pose of the image.
latent_editor = LatentEditorWrapper()
latents_after_edit = latent_editor.get_single_interface_gan_edits(w_pivot, [-2, 2])

for direction, factor_and_edit in latents_after_edit.items():
  print(f'Showing {direction} change')
  for latent in factor_and_edit.values():
    old_image = old_G.synthesis(latent, noise_mode='const', force_fp32=True)
    new_image = new_G.synthesis(latent, noise_mode='const', force_fp32=True)
    plot_syn_images([old_image, new_image])
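If you would rather keep the edited results than only display them, a small variation of the loop above writes every edit to disk. The output folder is arbitrary, and this assumes, as the loop suggests, that the inner dictionary is keyed by the edit factor.

import os

out_dir = '/content/PTI/edits'
os.makedirs(out_dir, exist_ok=True)
for direction, factor_and_edit in latents_after_edit.items():
  for factor, latent in factor_and_edit.items():
    img = new_G.synthesis(latent, noise_mode='const', force_fp32=True)
    img = (img.permute(0, 2, 3, 1) * 127.5 + 128).clamp(0, 255).to(torch.uint8).detach().cpu().numpy()[0]
    Image.fromarray(img, mode='RGB').save(f'{out_dir}/{direction}_{factor}.jpg')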







End Notes
In this article, we explored what exactly the PTI, or Pivotal Tuning Inversion, technique is. We also worked toward a better understanding of it by building a practical implementation of the model. You can now try it with more complex pictures and notice the difference in processing. The Colab implementation above can be accessed from here.
Happy learning!