Revenue data is one of the messiest fields in any CRM or marketing database — and one of the most important to get right. It might arrive as $1M-$5M from one enrichment vendor, 1000000-5000000 from an API, "under $10 million" from a form fill, or "10M+" from a list import. Before you can use any of it for segmentation, scoring, routing, or reporting, you need all of it normalized into consistent numeric bounds.
This is a data normalization problem, and it's one that comes up constantly for ops teams managing firmographic data at scale. In this post, we'll walk through exactly how to solve it: first by explaining why the obvious approaches tend to break down, then by showing the complete no-code solution available in the Openprise RevOps Data Automation (RDA) Cloud.
Why this is harder than it looks
If you're trying to parse revenue ranges from text in a one-time project, a Python script using regex or pandas can get you most of the way there. Something like this handles the easy cases:
python
import re
def parse_revenue_range(text):
text = text.lower().replace('$', '').replace(',', '')
text = re.sub(r'\s*million', '000000', text)
match = re.search(r'(\d+)\s*[-–to]+\s*(\d+)', text)
if match:
return int(match.group(1)), int(match.group(2))
return None
That handles "1-5 million" or "$10,000,000-50,000,000" reasonably well. But in practice, revenue fields in B2B databases resist clean parsing for several reasons:
The format variants are endless. Real data includes "over $10M", "<$1M", "10 to 50 million", "mid-market", "1B+", and values that are just blank or null. Each variant requires its own handling logic, and new variants appear whenever you onboard a new data source or vendor.
Range definitions change. Your scoring model might define "mid-market" as $50M–$500M today and $25M–$250M next quarter. A hardcoded script needs to be redeployed every time your business logic updates.
It doesn't run continuously. A script you run once doesn't keep your CRM clean as new records flow in from enrichment vendors, form fills, or list imports. You need the normalization logic to be part of a repeatable, automated process — not a manual step someone remembers to run.
It's one field among many. Revenue range is just one of dozens of fields that need this treatment: employee count, job level, industry codes, phone numbers, and more. Building and maintaining individual scripts for each one isn't sustainable.
This is where a dedicated data normalization layer — one that's built into your RevOps stack and runs continuously — makes more sense than point-in-time code.
How Openprise approaches data normalization for mixed-format fields
The Openprise RDA Cloud handles data normalization like this as a first-class, no-code operation. Rather than writing and maintaining transformation scripts, you configure reusable task templates that run automatically whenever records enter or update in your system.
The process for converting a mixed-text revenue field into clean numeric bounds involves four steps, each using a native task template. The end result is a standardized Revenue Range field (e.g., $1M–$5M) and optionally a Revenue Segment label (e.g., Mid-Market) that your scoring, routing, and reporting logic can consume reliably.
Here's how it works.
How Openprise RDA Cloud converts mixed-text fields
In this example, we'll start with a field with values that should contain numbers but has a mix of text and numbers instead — which is very common with fields that deal with revenue or the number of employees. The goal is to provide a numerical range, so the data must be cleansed and standardized, converted to numbers, and then placed in the ranges.
Let's start with a Revenue field that has a text format. Data could appear in multiple formats as in the screenshot below, such as with or without $, with or without , and with words such as under, over, or million. There might also be values like $10,000,000-50,000,000.
[SCREENSHOT: Revenue field showing varied text formats — to be updated]
Step one
Because the numbers we want to look at have characters such as $ and , the first step is to remove these characters.
- Using the Search and replace text task template, we replace the $ and , with a blank and replace the word million with six zeroes to make it look like a number. This can be done in two actions in the same task.
- Action 1 removes the $ and the ,.
[SCREENSHOT: Search and replace text task — Action 1 — to be updated]
- Action 2 replaces million with 000000. One of these has a space before it, so we can replace the space as well as the word.
[SCREENSHOT: Search and replace text task — Action 2 — to be updated]
- Now we can compare the data we started with to the data from our outputs:
[SCREENSHOT: Output comparison table — to be updated]
Most of the data now looks like a whole number, without any symbols, but there are still some values that aren't numbers:
[SCREENSHOT: Remaining non-numeric values — to be updated]
Step two
To assign these, we can create a data source to use with an Infer value task template:
[SCREENSHOT: Data source for Infer value task — to be updated]
[SCREENSHOT: Infer value table — to be updated]
For values like "$10 million-$50 million," we chose in this example to take a value in the middle, 25000000, but selecting the value you want to use would be at your discretion. You could also choose something like 50000000 if you wanted the highest possible value, or 10000000 if you wanted the lowest. Similarly, for "Over $10 million" we chose to use just one dollar over that, but the user could select the value that they felt most accurately represented what they know from this field.
- We can then use a filter to include only records that contain over, ,, under, or - for this task and then infer the value we want from this data source:
[SCREENSHOT: Infer value task with filter — to be updated]
- Now all the records are in the same text format that contains only numerals in the OP Revenue (mils removed) field:
[SCREENSHOT: Output table showing all-numeric values — to be updated]
Step three
From here, we can use the Change attribute type task template to convert the text numerals to a number.
Remember to use whole number as the attribute type when creating the new attribute, because only numbers can be put in numerical ranges; real numbers are unnecessary, because you don't need to worry about cents here.
[SCREENSHOT: Change attribute type task — to be updated]
Step four
Now that we have numbers, we can assign them to ranges.
- First, we'll need a data source that assigns ranges.
- As with the previous step, when setting up the data source, we'll need to make sure our "Min" and "Max" columns are imported as whole numbers.
[SCREENSHOT: Data source for Assign value to range task — to be updated]
[SCREENSHOT: Assign value to range output table — to be updated]
- Now we can assign the ranges using the Assign value to range task template. In the above example, we have both Revenue Range and Revenue Segment in two actions. As you can see by the 123 in the field descriptions, the formats for the "Max" and "Min" values are now numerical. The system gives us the flexibility to create one or both to accommodate any business requirement.
[SCREENSHOT: Assign value to range task — Revenue Range action — to be updated]
[SCREENSHOT: Assign value to range task — Revenue Range and Revenue Segment — to be updated]
- Here, we'll need to make sure to check the output to confirm we're getting the results we expected.
[SCREENSHOT: Final output table — to be updated]
Note that you could have a field like OP Revenue Range to show a number, or like OP Revenue Segment to show what kind of business that number indicates for your company. If these names or ranges changed in the future, you'd just need to update this table in the data source to update to the new ranges or values.
- We'll also need to check and validate to make sure none of the original revenue values are missing a Revenue Range or Revenue Segment.
[SCREENSHOT: Validation filter — to be updated]
- If there is an unmatched value at this point, we'll need to either adjust the tasks or create a new task to manage any data not already mapping. Here's what the final process looks like when we're finished.
[SCREENSHOT: Complete process flow — to be updated]
Revenue range normalization is one piece of a larger data quality picture
Those are the four steps to convert a mixed-text field into a clean numeric range — and to keep it clean automatically as new records flow into your system. The same no-code approach applies across the other firmographic fields that typically arrive in inconsistent formats: employee count, industry codes, phone numbers, job titles, and more.
Automated data normalization at this level is one of the core capabilities of the Openprise RDA Cloud. If your ops team is spending time on manual field cleanup — or if scoring, routing, and segmentation logic is breaking down because the underlying data isn't consistent — schedule a demo with one of our data consultants to see how it works across your full stack.
.jpg)















