An important part of machine learning is feature engineering (selection and extraction): choosing the variables that improve the model’s performance while discarding those that reduce it. The more impact your variables have on the performance metric, the better. Because the real world is complex, you may start with dozens or even hundreds of variables (= features), but in the end you only want to keep the ones that improve the model’s performance.
While there are algorithms, such as information gain, to help with this, expert judgment can help as well, because experts may have prior knowledge of the important inputs. One could therefore interview industry insiders before building a machine-learning model. Essentially, expert opinion narrows down the feature space. This approach has risks, primarily foregoing hidden or non-obvious features, as well as potential expert biases, but it also has obvious advantages in distinguishing signal from noise.
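To make the information-gain idea concrete, here is a minimal pure-Python sketch. The ad-related feature names and data are invented for illustration; a real pipeline would use a library implementation such as scikit-learn’s mutual information scorer.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label sequence, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    n = len(labels)
    conditional = 0.0
    for value in set(feature):
        subset = [lab for f, lab in zip(feature, labels) if f == value]
        conditional += len(subset) / n * entropy(subset)
    return entropy(labels) - conditional

# Hypothetical ad data: was the ad clicked (1) or not (0)?
clicked    = [1, 1, 1, 0, 0, 0]
has_emoji  = [1, 1, 1, 0, 0, 0]  # perfectly predictive feature
is_weekend = [1, 0, 0, 1, 0, 0]  # uninformative feature

print(information_gain(has_emoji, clicked))
print(information_gain(is_weekend, clicked))
```

Features with near-zero gain, like `is_weekend` here, are candidates for removal.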
Narrowing down the search space is the motivation for this article. I got to thinking, and did some rapid research, on which features matter for the performance of Facebook advertising. These could be used as the basis for a machine-learning model, e.g. to predict the performance of a given ad.
A. Text features
- sentiment
- wordCount
- charCount
B. Image features
- includesText
- includesProduct
- isDarkColorTheme
E. Misc features
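As a sketch of how the text features above could be computed, here is a minimal Python example. The sentiment lexicon is a toy placeholder (real work would use a trained model or a service such as MonkeyLearn), and the sample ad text is invented.

```python
# Toy sentiment lexicon -- a placeholder, not a real sentiment model.
POSITIVE = {"great", "love", "free", "best", "new"}
NEGATIVE = {"bad", "worst", "boring"}

def text_features(ad_text):
    """Compute the simple text features from the list above."""
    words = ad_text.lower().split()
    # Naive lexicon score: +1 per positive word, -1 per negative word.
    sentiment = sum((w in POSITIVE) - (w in NEGATIVE) for w in words)
    return {
        "sentiment": sentiment,
        "wordCount": len(words),
        "charCount": len(ad_text),
    }

print(text_features("Love our great new sneakers"))
```

The image features (e.g. `includesText`, `isDarkColorTheme`) would come from an image-analysis step rather than from code like this.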
A simple model could account only for C (independent and dependent variables) and D (independent variables), while more complex models would run a deeper analysis of text and images using linear or non-linear optimization, such as neural networks (shallow or deep learning). Some of these features could also be retrieved via commercial or public APIs, for example:
- Google Cloud Vision API – for image analysis 
- MonkeyLearn – for text analysis 
- EmojiNet API – for emoji analysis 
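As a sketch of the “simple model” end of that spectrum, here is a tiny logistic regression trained by gradient descent on two of the features above. The data and hyperparameters are invented; in practice one would use a library such as scikit-learn rather than hand-rolling the loop.

```python
import math

# Invented training data: rows are [sentiment, wordCount],
# target is 1 if the ad performed well, else 0.
X = [[3, 5], [2, 4], [0, 12], [-1, 15], [1, 6], [-2, 20]]
y = [1, 1, 0, 0, 1, 0]

w = [0.0, 0.0]  # one weight per feature
b = 0.0         # bias term
lr = 0.1        # learning rate

def predict(x):
    """Sigmoid of the linear score: estimated probability of success."""
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1 / (1 + math.exp(-z))

# Per-sample gradient descent on the log loss.
for _ in range(2000):
    for x, t in zip(X, y):
        err = predict(x) - t
        for i in range(len(w)):
            w[i] -= lr * err * x[i]
        b -= lr * err

print([round(predict(x)) for x in X])
```

Even this toy model illustrates the point of the article: once the feature space is narrowed down, the modeling step itself can be quite simple.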
Ideally, each advertiser has their own model, because models may not generalize well (e.g., different advertisers have different target groups). However, feature selection may benefit from learning from earlier experiences. Also, given enough data, a model might learn which features apply across different advertisers, achieving a greater degree of generalizability.