How do you convert mixed-text fields like revenue into a numerical range?

Converting mixed-text fields into numeric ranges means standardizing inconsistent values, like "$1M-$5M," "1000000-5000000," "under $10 million," and "10M+," into consistent numeric bounds you can segment, score, and route on. A one-time Python or regex script handles the easy cases, but real B2B data resists clean parsing because the format variants are endless and include vague labels like "mid-market" alongside hard numbers. Reliable normalization needs rules that map every variant, including non-numeric labels, to a defined range, and that run continuously as new data arrives. The Openprise RevOps Data Automation (RDA) Cloud provides a no-code way to apply that logic at scale.

Data normalization: converting mixed-text to numbers

Revenue data should be simple.

A company makes $10 million in revenue, so the revenue field says 10000000, your scoring model knows where it belongs, sales gets the right account priority, and everyone moves on with their day.

Cute idea. Very optimistic.

In the real world, revenue fields show up looking like a group project where nobody read the instructions. One enrichment vendor sends “$1M-$5M”. Another sends “1000000-5000000”. A form fill says “under $10 million”. A list import says “10M+”. Someone’s spreadsheet says “Over $10 million”, because apparently capitalization needed to be part of the adventure.

And revenue is just one example. The same issue shows up with employee count, company size, asset values, contract ranges, usage bands, and any other field where the data should be numeric but arrives as a mix of numbers, symbols, ranges, and words.

That’s where data normalization comes in.

For RevOps and marketing ops teams, normalizing mixed-text fields into numerical ranges is what makes the data usable for segmentation, scoring, routing, territory planning, reporting, and automation. It turns “kind of understandable to a human” into “reliable enough for systems to act on.”

And that second part matters. A lot.

If your CRM has five different ways to say the same revenue range, your automation will treat them like five different things. Your scoring logic gets messy. Your routing rules break. Your reports become a little too “interpretive dance.” And your team starts making decisions based on data that technically exists, but is not actually ready to use.

The good news: you do not need to write custom scripts every time this happens. With Openprise, you can convert mixed-text data into clean numerical ranges using a no-code, repeatable workflow that runs as part of your broader data cleansing and standardization process.

Here’s how it works.

Why mixed-text fields are so hard to use

Mixed-text data is exactly what it sounds like: a field that should contain clean, structured values but instead contains a mix of text, numbers, punctuation, symbols, and business shorthand.

For a revenue field, that might include values like:

$10,000,000
10 million
Over $10 million
Under $1 million
$10,000,000-50,000,000
10M+
1-5 million

A person can usually read those and understand the general meaning. Your CRM, MAP, scoring model, routing workflow, and reporting tools are not as forgiving.

To a system, “$10,000,000”, “10 million”, and 10000000 are not automatically the same thing. One is text with a symbol and commas. One is text with a word. One is a number. If you want automation to work consistently, those values need to be cleaned, standardized, converted to a numeric format, and mapped into defined ranges.
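To see the gap in concrete terms, here is a minimal Python sketch. Python is used only for illustration; the helper name parses_as_integer is made up for this example and is not part of any Openprise feature.

```python
# Three ways of writing the same revenue, as they might arrive from different sources.
raw_values = ["$10,000,000", "10 million", "10000000"]


def parses_as_integer(value: str) -> bool:
    """Return True only if the string can be read directly as a whole number."""
    try:
        int(value)
        return True
    except ValueError:
        return False


# Compared as text, all three are distinct values.
distinct_count = len(set(raw_values))

# Only the last one can be read as a number without any cleanup.
parse_results = {value: parses_as_integer(value) for value in raw_values}

print(distinct_count)
print(parse_results)
```

A human sees one revenue figure here. A system comparing strings sees three, and only one of them is ready for numeric logic.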

That is the core idea behind normalization: take many messy versions of a value and turn them into one consistent format your systems can trust.

This is especially important when data comes from multiple sources. If you are using enrichment vendors, web forms, event lists, partner files, purchased lists, manual uploads, and CRM imports, each source may use a different format. That is normal. Annoying, but normal.

The trick is not expecting every source to magically behave. The trick is building a normalization process that can handle the mess automatically.

That is where a data orchestration approach helps. Instead of fixing fields manually or building one-off scripts, you create a repeatable process that cleans, transforms, validates, and writes back standardized data across your GTM systems.

How to turn mixed text into numerical ranges

For this example, let’s use a Revenue field.

The starting field is text-based and contains a mix of formats. Some values include dollar signs. Some include commas. Some include words like under, over, or million. Some contain ranges, such as $10,000,000-50,000,000.

The goal is to create clean output fields your business can use, such as:

OP Revenue Number
OP Revenue Range
OP Revenue Segment

The exact field names are up to you. The important part is that the final values are standardized and usable.

For example, a messy value like Over $10 million might eventually map to:

OP Revenue Number: 10000001
OP Revenue Range: $10M-$50M
OP Revenue Segment: Mid-market

Or a value like $10,000,000-50,000,000 might map to:

OP Revenue Number: 30000000
OP Revenue Range: $10M-$50M
OP Revenue Segment: Mid-market

Your range definitions may be different. That is the point. Openprise lets you configure this logic based on your business rules, not someone else’s default taxonomy.

Step 1: remove symbols and standardize common text

The first step is to clean the obvious formatting issues.

In revenue data, the usual suspects are dollar signs, commas, and words like million. These characters make the field easier for a human to read, but harder for systems to evaluate as a number.

Using the Search and Replace Text task template in Openprise, you can remove or replace those values automatically.

For example:

Replace $ with a blank value
Replace , with a blank value
Replace “million” with 000000
Replace “ million” (with a leading space) with 000000

That last version includes the space before the word million, which helps clean up values like 10 million without leaving extra spacing behind.

After this step, many values that used to look like text now look much closer to numbers.

$10,000,000 becomes 10000000
10 million becomes 10000000
$50,000,000 becomes 50000000

This is a big improvement, but you are not done yet. Some values still contain words or range indicators, such as under, over, or -. Those values need additional handling before they can become clean numbers.
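For readers who think in code, the replacement rules in this step can be sketched in a few lines of Python. This is an illustration of the logic only, not how the Search and Replace Text task template is implemented. The function name search_and_replace and the REPLACEMENTS list are assumptions made for the example. Because this sketch applies the rules one after another, the pattern with the leading space is listed before the plain word so no stray space is left behind.

```python
# Ordered (search, replacement) pairs based on the rules in this step.
# " million" (with the leading space) runs before "million" so that
# "10 million" becomes "10000000" rather than "10 000000".
REPLACEMENTS = [
    ("$", ""),
    (",", ""),
    (" million", "000000"),
    ("million", "000000"),
]


def search_and_replace(value: str) -> str:
    """Apply each replacement in order and trim surrounding whitespace."""
    cleaned = value
    for search, replacement in REPLACEMENTS:
        cleaned = cleaned.replace(search, replacement)
    return cleaned.strip()


for raw in ["$10,000,000", "10 million", "$50,000,000", "Over $10 million"]:
    print(raw, "->", search_and_replace(raw))
```

Note that this sketch is case-sensitive and deliberately leaves words like “Over” in place. Values that still contain text after this pass are the ones the next step has to deal with.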

Step 2: infer values for ranges and text-based exceptions

Once the easy formatting issues are removed, you still need to handle values that are not directly numeric.

For example:

Under $1 million
Over $10 million
$10 million-$50 million

These values describe a number or range, but they do not give you one clean number to work with.

In Openprise, you can handle this using a reference data source and the Infer Value task template. This lets you define how specific text patterns should map to a normalized numeric value.

For example, you might decide:

Under $1 million maps to 999999
Over $10 million maps to 10000001
$10 million-$50 million maps to 30000000

That middle value for $10 million-$50 million is a business choice. In this example, we are using the midpoint: (10,000,000 + 50,000,000) / 2 = 30,000,000. But you could choose the lower bound, the upper bound, or another value based on how your segmentation and scoring models work.
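Here is a minimal Python sketch of the same idea: a small reference table plus a lookup. It is not the Infer Value task template itself. The names INFERRED_VALUES, representative_value, and infer_value are hypothetical, and the keys are written in their original form for readability. In a real workflow, the keys would need to match whatever the text looks like at the point where the lookup runs.

```python
from typing import Optional


def representative_value(low: int, high: int, policy: str = "midpoint") -> int:
    """Pick one number to stand in for a range, according to a business policy."""
    if policy == "lower":
        return low
    if policy == "upper":
        return high
    if policy == "midpoint":
        return (low + high) // 2
    raise ValueError(f"unknown policy: {policy}")


# Hypothetical reference table: lowercased text pattern -> inferred whole number.
INFERRED_VALUES = {
    "under $1 million": 999_999,
    "over $10 million": 10_000_001,
    "$10 million-$50 million": representative_value(10_000_000, 50_000_000),
}


def infer_value(raw: str) -> Optional[int]:
    """Return the mapped number for a known pattern, or None if it is unmapped."""
    # Lowercase and collapse whitespace so capitalization and spacing do not matter.
    key = " ".join(raw.lower().split())
    return INFERRED_VALUES.get(key)


print(infer_value("Over $10 million"))
print(infer_value("big-ish"))
```

Keeping the policy as a parameter makes the business decision explicit: switching from midpoint to lower bound is a one-word change to the table, not a rewrite of the logic.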

This is one of the most important parts of the process: normalization is not just a technical cleanup exercise. It is where your business logic becomes operational.

For some companies, “over $10 million” should map to the next dollar above $10 million. For others, it might map to the start of a revenue band. For others, it might trigger a segment instead of a specific number. The right answer depends on how your team uses the data.

With Openprise, that logic lives in a configurable data source rather than buried in a script. So when your revenue bands change, you update the reference table instead of asking someone to rewrite code.

This is especially useful for teams managing enrichment data from multiple vendors. If one provider returns revenue as a range and another returns revenue as a specific value, Openprise can standardize the outputs before the data is used downstream. That is also why normalization pairs so well with multi-vendor data enrichment, where different providers may return similar data in very different formats.

Step 3: convert the cleaned text into a number

At this point, the values may look like numbers, but they may still technically be stored as text.

That distinction matters.

A text value of 10000000 may look right in a table, but it cannot be reliably used for numeric range assignment until it is converted into a numeric attribute type.

Using the Change Attribute Type task template, you can convert the cleaned text field into a number field.

For revenue and employee count fields, use whole number as the attribute type. You do not need decimals for revenue ranges in this context. We are not calculating cents here. We are creating clean numeric values that can be placed into business ranges.

Once the field is converted to a whole number, Openprise can evaluate it numerically. That means the value can be compared against minimum and maximum thresholds in a range table.
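A short Python sketch shows why the text-versus-number distinction is more than a technicality. The to_whole_number helper is hypothetical and only illustrates the idea behind a type conversion. It is not the Change Attribute Type task template.

```python
from typing import Optional


def to_whole_number(cleaned: str) -> Optional[int]:
    """Convert cleaned text to an int, or return None if it is not purely digits."""
    text = cleaned.strip()
    # isascii() + isdigit() rejects signs, decimals, leftover words, and empty strings.
    if text.isascii() and text.isdigit():
        return int(text)
    return None


as_text = ["9000000", "10000000"]

# Sorted as text, "10000000" comes first because "1" sorts before "9".
text_order = sorted(as_text)

# Sorted as numbers, 9 million correctly comes before 10 million.
number_order = sorted(to_whole_number(value) for value in as_text)

print(text_order)
print(number_order)
print(to_whole_number("Over 10000000"))
```

Returning None for anything that is not a clean whole number, rather than guessing, keeps bad values visible so they can be caught during validation.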

This is the point where the field becomes truly useful for automation.

Step 4: assign the number to a range or segment

Now that the values are numeric, you can assign them to ranges.

To do this, create a data source that defines the range logic. At a minimum, this table should include columns like:

Min
Max
Revenue Range

You may also include a segment label, such as:

Revenue Segment

For example:

Min	Max	Revenue Range	Revenue Segment
0	999,999	Under $1M	SMB
1,000,000	9,999,999	$1M-$10M	Emerging
10,000,000	49,999,999	$10M-$50M	Mid-market
50,000,000	499,999,999	$50M-$500M	Enterprise
500,000,000	999,999,999,999	$500M+	Strategic

When setting up the data source, make sure the Min and Max columns are imported as whole numbers. That way, Openprise can compare the cleaned numeric revenue value against the correct thresholds.

Then use the Assign Value to Range task template to map each numeric revenue value to the appropriate range.

You can create one output, such as Revenue Range, or multiple outputs, such as both Revenue Range and Revenue Segment. It depends on how your business uses the data.

Some teams want a range for reporting. Others want a segment for routing or scoring. Many want both.

For example:

25000000 maps to $10M-$50M
25000000 also maps to Mid-market
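The range lookup itself is simple to sketch in Python. The thresholds and segment names below are copied from the example table in this step. The RevenueBand type and assign_to_range function are hypothetical names for this illustration, not the Assign Value to Range task template. The range labels use a plain hyphen, matching the output examples.

```python
from typing import NamedTuple, Optional


class RevenueBand(NamedTuple):
    minimum: int
    maximum: int
    revenue_range: str
    revenue_segment: str


# Inclusive Min/Max bounds, stored as whole numbers.
RANGE_TABLE = [
    RevenueBand(0, 999_999, "Under $1M", "SMB"),
    RevenueBand(1_000_000, 9_999_999, "$1M-$10M", "Emerging"),
    RevenueBand(10_000_000, 49_999_999, "$10M-$50M", "Mid-market"),
    RevenueBand(50_000_000, 499_999_999, "$50M-$500M", "Enterprise"),
    RevenueBand(500_000_000, 999_999_999_999, "$500M+", "Strategic"),
]


def assign_to_range(revenue: int) -> Optional[RevenueBand]:
    """Return the first band whose inclusive bounds contain the value, else None."""
    for band in RANGE_TABLE:
        if band.minimum <= revenue <= band.maximum:
            return band
    return None


band = assign_to_range(25_000_000)
print(band.revenue_range, band.revenue_segment)
```

Because the bounds are inclusive and do not overlap, every whole number between 0 and the top of the table lands in exactly one band. Anything outside the table returns None, which shows up as a blank during validation.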

That gives you clean, consistent values that sales, marketing, customer success, and RevOps can actually use.

Step 5: validate the output

Please do not skip this step. Future you deserves better.

After assigning values to ranges, check your outputs to make sure the results match what you expected. Look for records where the original Revenue value exists but the new Revenue Range or Revenue Segment is blank.

Those blanks are your unmatched values.

They usually mean one of three things happened:

The original value used a format your workflow does not handle yet.
The inferred value table needs another mapping.
The range table is missing a min/max threshold for that value.

When you find unmatched values, update your task logic or reference data source so the workflow can handle them next time.
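As a rough sketch of what that check looks like, the Python below counts original values that exist but never received a range. The sample records and the helper names is_blank and unmatched_values are invented for this illustration. In practice, the records would come from your processed data.

```python
from collections import Counter

# Hypothetical processed records: the original text plus the assigned range.
records = [
    {"Revenue": "$10,000,000", "Revenue Range": "$10M-$50M"},
    {"Revenue": "10M+", "Revenue Range": ""},
    {"Revenue": "big-ish", "Revenue Range": None},
    {"Revenue": "10M+", "Revenue Range": ""},
    {"Revenue": "", "Revenue Range": ""},  # nothing to normalize, so not an exception
]


def is_blank(value) -> bool:
    """Treat None, empty strings, and whitespace-only strings as blank."""
    return value is None or str(value).strip() == ""


def unmatched_values(rows) -> Counter:
    """Count original values that exist but did not receive a range."""
    return Counter(
        row["Revenue"].strip()
        for row in rows
        if not is_blank(row.get("Revenue")) and is_blank(row.get("Revenue Range"))
    )


exceptions = unmatched_values(records)

# Most frequent unmatched formats first, so the biggest gaps get fixed first.
for value, count in exceptions.most_common():
    print(value, count)
```

Sorting exceptions by frequency is a design choice: one new mapping for a common format usually clears more records than chasing every one-off value.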

This is how normalization gets stronger over time. You do not need to predict every possible messy value on day one. You need a process that lets you identify exceptions, add logic, and keep improving without starting from scratch.

Why this should not be a one-time cleanup project

You could normalize a revenue field once in a spreadsheet. You could write a quick script. You could ask someone on the team to “just clean it up before importing.”

And for one list, maybe that works.

But GTM data does not stay clean. New records are created every day. Enrichment vendors update fields. Sales reps make edits. Forms collect new values. Event lists show up with their own formatting. Integrations sync data from one system to another, sometimes politely and sometimes like a raccoon got into the pantry.

That is why mixed-text normalization should be part of an automated data process, not a one-time cleanup effort.

With Openprise, you can build this workflow once, then run it continuously as new data enters your systems. That means revenue and employee count values can be standardized before they affect segmentation, routing, scoring, reporting, or downstream workflows.

It also means your GTM systems work better together. Clean, standardized fields make it easier to keep Salesforce, Marketo, HubSpot, enrichment vendors, data warehouses, and other tools aligned. If system syncs and field-level logic are already creating headaches, it may be time to look at your broader GTM system integration strategy too.

Where this fits in your larger GTM data quality strategy

Converting mixed-text fields into numerical ranges is one small workflow. But it represents a much bigger principle: your GTM data needs to be structured enough for automation to trust it.

Revenue range normalization can improve:

Lead scoring
Account segmentation
Territory planning
Campaign targeting
ICP modeling
Sales routing
Executive reporting
AI workflow performance

The same approach can apply to other fields too, including employee count, job level, industry, company size, and other firmographic or operational fields that arrive in inconsistent formats.

This is the kind of work that often gets dismissed as “data cleanup,” but it is actually foundational. Badly formatted data does not just make reports ugly. It breaks the logic your GTM motion depends on.

And as more teams add AI into their workflows, this becomes even more important. AI tools perform better when the data feeding them is clean, structured, and consistent. Otherwise, you end up asking AI to interpret messy fields over and over again, which adds cost, inconsistency, and avoidable risk.

Openprise gives RevOps teams a no-code way to clean, standardize, enrich, segment, and activate GTM data across the stack. The result is not just prettier fields. It is data your systems can actually use.

How to make your mixed-text data usable

Mixed-text fields are common. Living with them forever is optional.

With Openprise, you can convert messy revenue or employee count values into clean numerical ranges using a repeatable no-code workflow:

Clean the symbols and common text.
Infer values for ranges and exceptions.
Convert the result into a whole number.
Assign the number to a range or segment.
Validate the output and catch anything unmatched.

That gives your team consistent values for segmentation, scoring, routing, reporting, and every other workflow that depends on trustworthy data.

And once the process is built, it can keep running as new records enter your systems. No spreadsheet gymnastics. No brittle one-off scripts. No “who imported this list and why does revenue say ‘big-ish’?”

Just clean, standardized GTM data that is ready to work.

Want to see how Openprise handles data normalization across your GTM stack? Talk to our team.