We compute the luminosity function (LF) and the formation rate of long gamma-ray bursts (GRBs) by fitting the observed differential peak flux distribution obtained by the BATSE experiment in two different scenarios: (i) the GRB luminosity evolves with redshift and (ii) GRBs form preferentially in low-metallicity environments. In both cases, model predictions are consistent with the Swift number counts and with the number of detections at z>2.5 and z>3.5. To discriminate between the two evolutionary scenarios, we compare the model results with the number of luminous bursts (i.e. those with isotropic peak luminosity in excess of 10^53 erg s^-1) detected by Swift during the first three years of its mission. Our sample conservatively contains only bursts with a reliable redshift determination and a measured peak energy, so the observed counts are lower limits. We find that pure luminosity evolution models can account for the number of secure identifications. In the case of a pure density evolution scenario, models with a metallicity threshold Z_th>0.3 Zsun are ruled out with high confidence. For lower metallicity thresholds, the model results are still statistically consistent with the available lower limits. However, many factors can increase the discrepancy between model results and data, indicating that some luminosity evolution in the GRB LF may be needed even for such low values of Z_th. Finally, using these new constraints, we derive robust upper limits on the bright end of the GRB LF, showing that its slope cannot be steeper than ~2.6.
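As an illustration of how a lower limit on the number of luminous bursts can exclude a model, the comparison can be sketched as a one-sided counting test. This is a minimal sketch only: it assumes Poisson counting statistics, which the text above does not state, and the function names, the significance threshold `alpha`, and the example counts are hypothetical rather than the values or method used in this work.

```python
import math


def poisson_prob_at_least(n_obs: int, mu: float) -> float:
    """Return P(N >= n_obs) for N ~ Poisson(mu).

    Assumption: detections follow Poisson counting statistics.
    """
    if n_obs <= 0:
        return 1.0
    # Accumulate P(N = k) for k = 0 .. n_obs - 1, then take the complement.
    term = math.exp(-mu)
    cdf = term
    for k in range(1, n_obs):
        term *= mu / k
        cdf += term
    return max(0.0, 1.0 - cdf)


def is_ruled_out(n_obs_lower_limit: int, mu_model: float, alpha: float = 0.01) -> bool:
    """Flag a model that predicts too few luminous bursts.

    n_obs_lower_limit: observed count, treated as a lower limit (hypothetical input).
    mu_model: expected count predicted by the model (hypothetical input).
    alpha: hypothetical significance threshold for exclusion.
    """
    return poisson_prob_at_least(n_obs_lower_limit, mu_model) < alpha


if __name__ == "__main__":
    # Purely illustrative numbers, not results from this work.
    for mu in (1.0, 4.0):
        p = poisson_prob_at_least(3, mu)
        print(f"mu = {mu}: P(N >= 3) = {p:.3f}, ruled out: {is_ruled_out(3, mu)}")
```

Because the observed count is a lower limit, only models that predict too few luminous bursts can be excluded in this way; a model that predicts more than the observed count remains allowed.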