Providing natural language instructions in prompts is a useful new paradigm for improving task performance of large language models in a zero-shot setting. Recent work has aimed to improve such prompts via manual rewriting or gradient-based tuning. However, manual rewriting is time-consuming and requires subjective interpretation, while gradient-based tuning can be extremely computationally demanding for large models and requires full access to model weights, which may not be available for API-based models. In this work, we introduce Gradient-free Instructional Prompt Search (GrIPS), a gradient-free, edit-based search approach for improving task instructions for large language models. GrIPS takes in instructions designed for humans and automatically returns an improved, edited prompt, while allowing for API-based tuning. The instructions in our search are iteratively edited using four operations (delete, add, swap, paraphrase) on text at the phrase-level. With InstructGPT models, GrIPS improves the average task performance by up to 4.30 percentage points on eight classification tasks from the Natural-Instructions dataset. We see improvements for both instruction-only prompts and for k-shot example+instruction prompts. Notably, GrIPS outperforms manual rewriting following the guidelines in Mishra et al. (2022) and also outperforms purely example-based prompts while controlling for the available compute and data budget. Lastly, we provide qualitative analysis of the edited instructions across several scales of GPT models.