Abstract

We have seen significant improvements in machine translation due to the use of deep learning. While the improvements in translation quality are impressive, the encoder-decoder architecture opens up many more possibilities. In this talk, I will present methods to integrate additional constraints into the model: length constraints, to display the output in a given format, as well as time constraints in low-latency speech translation. We will focus on methods that learn these additional constraints in an unsupervised way, so that the model can make use of the standard parallel training data for machine translation. Achieving this requires addressing several components in the development of a machine translation system; in this talk, we will cover adaptations to the training, the architecture, and the inference. By applying these methods, we are able to generate shorter translations that directly summarize the source sentence. In a study, we compare these to the human-generated subtitles used in TV. By combining these methods with multilingual machine translation, we are also able to perform monolingual shortening. For low-latency translation, we show that latency can be significantly reduced with only a minimal loss in translation quality.
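As a concrete illustration of how such length constraints can be learned from standard parallel data alone, the sketch below shows one common tagging approach: each training pair is labeled with a coarse target-to-source length-ratio token, and at inference time the desired token is prepended to steer the output length. This is a minimal sketch of the general technique only; the token names and bin thresholds are illustrative assumptions, not necessarily the exact method presented in the talk.

```python
# Minimal sketch: unsupervised length control via length-ratio tokens.
# Token names and bin thresholds are illustrative assumptions, not
# necessarily those used in the methods presented in the talk.

def length_ratio_token(src: str, tgt: str) -> str:
    """Map a parallel pair to a coarse target/source length-ratio bin."""
    ratio = len(tgt.split()) / max(len(src.split()), 1)
    if ratio < 0.8:
        return "<len:short>"
    if ratio <= 1.2:
        return "<len:normal>"
    return "<len:long>"

def tag_training_pair(src: str, tgt: str) -> tuple[str, str]:
    """Prepend the ratio token to the source; training itself stays standard."""
    return f"{length_ratio_token(src, tgt)} {src}", tgt

# At inference, a shorter, summarizing output can be requested by
# prepending the corresponding token to the source sentence, e.g.
# (with a hypothetical translate function):
#   translate("<len:short> " + source_sentence)
```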